Figure: Example of a system in operation, with avatars, chat and a "Smart Ring" controller.

"Watching Together: Creating seamless shared social experiences between remote users watching television"
Abstract
Watching television began as a shared experience, but it has become a solitary pursuit. However, the desire to enjoy a favourite show in the company of others still exists. Whilst advances in technology have enabled niche content and video games to become shared social experiences, television has been left behind. This paper proposes a seamless system that could connect a group of people, separated by distance, so that they can enjoy watching a television programme together.
Introduction
Today, anyone can watch their favourite TV show anywhere, and it’s likely to be something done alone. Half of the nation’s viewing is done in the bedroom, 16% in the kitchen, and 16% on the daily commute. Television sets are inexpensive and on-demand programming can be viewed on a multitude of ubiquitous devices, resulting in only 30% of UK viewers saying that they watch programmes together every day [15].
This was not always the case. Television watching used to be a communal experience that brought the family together. In November 1936, at the launch of the BBC’s first television service, the first set cost the equivalent of £11,000 in today’s money. With programming also scarce, viewing tended to take place as a family gathering in a single room [26] [27].
There continues to be a desire for the shared viewing experience, even if it is unmet. 58% of UK viewers say that they prefer to watch live broadcasts of television together “because it’s good to know everyone is watching at the same time” [15]. People who remember a time before Netflix may recall the romantic experience of telephoning their partner and “watching a show together”.
Furthermore, there is a health imperative. Social isolation and loneliness have become major issues, with causal links suggesting loneliness can increase the likelihood of mortality by 26% [10]. With two fifths of UK seniors reporting that they are so lonely television is their main form of company, a more socially interactive television could benefit many people [28].
In this paper we will:
1. Explore the concept of seamless technology in the context of shared television watching, to deliver an experience that matches or even extends the actual in-person shared experience.
2. Draw inspiration from services in related domains that enable remote sharing.
3. Suggest how to measure the effectiveness of such a system.
Note that in this paper there is no distinction between watching “live” broadcasts, recorded programmes or streaming services. The focus is on watching shows together, and on the technologies and services that could seamlessly connect and synchronise all of the constituent components.
Critical Review
The “seamless” experience
When people watch together, everyone sits down, a programme is chosen with a remote control, and everyone passively engages with the show whilst enjoying sharing the experience in company. No further distracting interaction with technology is necessary, save maybe adjusting the volume. We describe such an experience as immersive and seamless.
Immersiveness, a term often used in the video game world, refers to users being sufficiently engaged, and distractions sufficiently removed, that they “lose themselves in the world of the game” [11]. Jennett et al. defined three levels of immersion: engagement; engrossment, where “the gamer’s emotions (are) directly affected by the game and the controls (become) ‘invisible’”; and total immersion [11].
Weiser spoke of invisible tools being part of his definition of a seamless experience, “...by invisible, I mean that the tool does not intrude on your consciousness; you focus on the task, not the tool. Eyeglasses are a good tool – you look at the world, not the eyeglasses” [21].
From these definitions we can create criteria to help in the design of our system [12].
1. The user is not encumbered with bulky components that cause discomfort with prolonged use.
2. The user retains awareness of their real-world surroundings.
3. The visuals are of a high quality: high resolution, appropriate brightness, and comfortable frame rate.
4. The audio experience has high integrity: good quality, directional, and synchronised with the visuals.
5. The system components are interconnected in an ecosystem (they all “talk” to one another).
6. The system has a natural user interface, with intuitive, familiar and contextually appropriate controls.
7. The system facilitates a connection that is close to a natural conversation.
We can now look at services in related domains that enable sharing of different viewing experiences and judge them using these criteria.
Second Screening and voice chat
The use of social media whilst watching television has become a common practice [17]. Although this “second-screen” chatting may detract from the viewing experience, Weisz et al. reported that chat can “supplement poor material by making the experience of watching it more enjoyable”, and it has a significantly positive influence on social relationships [22].
Voice chat has become an integral component of collaborative video gaming and, via third-party applications such as Discord, facilitates general conversation. Wadley et al. concluded that voice chat radically improved the experience of online gaming and made gaming considerably more social [20].
Geerts compared users’ experiences of voice and text chat applications while watching television and concluded that voice is considered more natural and direct and that, because users do not have to interact with a device, it is easier to maintain focus on the programme. However, younger users preferred text, due to their comfort with second-screening. They find it easier to dismiss or ignore messages, making it less distracting [7].
Virtual Reality
With Google’s £5 Cardboard and Daydream concept, VR has become mainstream and available to anyone with a reasonably high-end smartphone.
VR can create virtual venues for watching content. In 2015 Netflix launched a VR app letting viewers sit on a sofa in a virtual grand apartment and watch their shows [1]. The Plex app takes that further, including a social component enabling friends to join as avatars, complete with voice chat [23].
VR prevents the user from retaining awareness of their real-world surroundings. Mark Weiser described this entirely synthetic environment as “...taking the glutinous approach to user interfaces design, (putting) the interface at the centre of attention, leaving the real world behind” [21]. However, a user living in a small apartment may find the distraction-free and more exciting VR environment preferable to their own.
Reaction Videos and Twitch
To address social isolation and loneliness, an individual with no-one else to watch with could access a community of enthusiastic television fans. “Reactors” film themselves reacting to things and upload the result to sites such as YouTube [31]. These videos now represent a sizeable part of the platform’s mainstream content, and videos of people watching television programmes are particularly popular within the genre. Filming reaction videos tends to be a hobby, but in 2011 Amazon-owned Twitch enabled video gamers to monetise the live-streaming of their game-playing, and with its current 2.2 million broadcasters this provides a business model for streamers. Viewers watch a real-time video of the broadcaster talking to camera, listen to commentary, and can interact with the gamer via text chat [5] [6] [16] [19].
Our system could seamlessly connect with a Twitch-like monetising platform, to make watching a programme with a favourite reactor as simple as watching a show with a family member. This could be a boon for socially isolated people.
Scenario
Alexander launches the “TV-Together” app on his phone. He notices a special anniversary episode of Doctor Who is on BBC1 that evening, but because he’s away on business the family won’t be together to watch their favourite show and he’ll be watching it alone. He uses the app to invite his wife and daughter to watch with him. Alexander’s dad Tom lives alone in Scotland, and is a fan of the show, so he gets invited too.
That evening, in his hotel room, Alexander sits on the bed facing the large LCD TV switched to BBC1. He puts on his AR glasses, and places a wearable speaker around his neck. He taps the smart ring on his right index finger with his thumb to activate it. Alexander’s phone recognises the ring activation tap combined with the proximity of the AR glasses, wearable speaker and timing of the TV-Together booking, so it automatically synchronises all of the components, ready to begin with the wave of a hand.
Alexander’s wife, Susan, and his daughter Claire, are already logged in, and appear as two avatars directly in front of him. His wife is a 3D caricature avatar, with a speech bubble next to it saying, “Popcorn at the ready?!”. She has work to do that evening so is sitting with her laptop and chooses to communicate via text chat.
Claire appears as an animated rabbit, her favourite avatar. She is wearing an Apple EarPod, so she can both watch television and hear her dad in her ear. Her avatar is animated by a combination of face scanning, which mimics her expressions, and speech recognition, which makes it mouth her words.
Finally, Alexander’s dad Tom pops up in a video window to the right. Tom is deaf, so he has subtitles on his television screen, and he appears as a video, so he can sign to Alexander via the smartphone on a stand near his chair. Claire sees her grandad’s video on a window on her phone, so she signs “Hi Grandad!” to her phone. This pops up on Tom’s phone as a signing animated rabbit.
With his right hand, Alexander “picks up” the avatar of Susan. The smart ring correlates its place in 3D space to the virtual images created by the TV-Together application. A slow hand-closing gesture allows him to pick up and move virtual objects. A hand-opening gesture places the object somewhere else in the field of view. He puts Claire on the bottom left, and his dad in the top right. He moves his head around, and they stay fixed in their positions in the room (so they don’t follow him into the bathroom).
As the show starts, Claire squeals with delight. Thanks to the wearable speaker the sound seems to come straight from the rabbit’s mouth. Alexander moves his right hand to the rabbit and he makes a tapping gesture with his index finger that brings up some familiar settings icons and a volume control. He toggles this down a little to balance it with the volume of the television. The programme begins.
Alexander adores seeing his daughter enjoy the show he loved so much as a child, so seeing her avatar giggle at the jokes, hide behind her paws at the scary moments, and shed a tear when the Doctor regenerates makes him happy.
A speech bubble pops up next to Susan’s avatar. Although it would normally be converted via speech synthesis to a spoken sentence, Susan sent it as a private message, so it’s not seen or heard by the others. “Remind me what planet the Daleks come from, I don’t want to ask Claire, or she’ll laugh at me!”. “Skaro, LOL!” Alexander types as a response. “Thanks, you big geek!” appears as another speech bubble then fades away.
After an hour the show finishes, Susan and Tom sign out, and Alexander has a further 10-minute chat with his daughter, now a straight video stream of her, as she tells him about her favourite part of the show.
After a quick meal in the hotel bar, Alexander launches the TV-Together app again, and notices on the schedule that there’s a Sherlock marathon on iPlayer. An icon indicates that the ChiqueGeeks, a pair of well-known Doctor Who reactors, are planning to livestream a reaction video at 9pm so he signs up for it. The app casts the show onto the hotel room LCD screen and synchronises the streams. Alexander moves the ChiqueGeek reaction stream onto a picture frame on the wall next to the TV, then settles down for an evening binge-watching his favourite show, with the banter of his favourite reactors. Alexander taps the ChiqueGeek avatar and taps on the keyboard symbol. A virtual keyboard pops up in front of him. His subscription doesn’t allow him to chat, but he can message them. He types in “No revealing who the murderer is!”, hits send and the keyboard fades away. ChiqueGeek Amy winks at the camera and says out loud, “It’s elementary Alexander!”. Throughout the show, they provide humorous commentary and trivia that makes watching alone much more fun.
Proposed System
In this section we will look at what components would be necessary to create “the system”. They would include:
Content Service
A content service would be required to enable multiple users to watch a programme in different locations with perfect synchronisation. This would have to happen automatically, so when the participants gather, the first person starts it for everyone.
BBC R&D have been developing iPlayer services with media synchronisation features. Devices can communicate with one another, share additional content and add accessibility features such as audio descriptions [24].
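One way such synchronisation could work is sketched below. This is a minimal illustration under stated assumptions, not a description of the BBC R&D approach: the first viewer to press play publishes a shared start time, and each device derives its own playback position from that time and an estimate of its clock offset. The names `SharedSession`, `playback_position` and `seek_correction`, and the 0.2-second tolerance, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedSession:
    """State published by the first viewer to press play (hypothetical)."""
    start_time: float  # shared-clock time, in seconds, at which position 0 played


def playback_position(session: SharedSession, local_clock: float,
                      clock_offset: float) -> float:
    """Return where in the programme this device should be, in seconds.

    clock_offset is the estimated (shared clock - local clock) difference,
    assumed to come from some separate clock-synchronisation step.
    """
    shared_now = local_clock + clock_offset
    return max(0.0, shared_now - session.start_time)


def seek_correction(target: float, actual: float, tolerance: float = 0.2) -> float:
    """Return the seek adjustment in seconds, or 0.0 if drift is within tolerance."""
    drift = target - actual
    return drift if abs(drift) > tolerance else 0.0
```

Ignoring small drifts avoids constant, visible seeking; a real player would more likely adjust playback speed slightly than jump.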
Scheduling
A scheduling service would be required to combine iPlayer-style live broadcast and catch-up listings with a social media-style application that connects the viewer to family and friends. The viewer could then set up the shared experience either by finding out who is available at a given time and polling on what to watch, or by selecting a programme and finding out who might be interested in watching it with them. The EU-funded 2-IMMERSE project used Facebook to co-ordinate co-viewers for their “Watching Theatre at Home” prototype [25].
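The first of these two routes, finding a time when everyone is free and then polling on what to watch, can be sketched as follows. This is an illustrative sketch only; the function names, the use of plain strings for time slots, and the alphabetical tie-break are assumptions rather than part of any existing service.

```python
from collections import Counter
from typing import Dict, Optional, Set


def common_slots(availability: Dict[str, Set[str]]) -> Set[str]:
    """Return the time slots in which every invited viewer is free."""
    slots = list(availability.values())
    return set.intersection(*slots) if slots else set()


def poll_winner(votes: Dict[str, str]) -> Optional[str]:
    """Return the programme with the most votes.

    Ties go to the title that sorts first (an arbitrary, assumed rule).
    """
    if not votes:
        return None
    tally = Counter(votes.values())
    best = max(tally.values())
    return min(title for title, count in tally.items() if count == best)
```

For example, with Alexander free at 19:00 and 20:00, Susan at 20:00 and 21:00, and Tom at 20:00, `common_slots` yields only the 20:00 slot, and the poll is then run among those three viewers.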
Augmented Reality Glasses
Augmented reality (AR) glasses would need to be lightweight, enable a clear view of the viewer's home, and enable high-quality overlaid graphics and video. Magic Leap currently make the smallest headset, resembling a pair of diving goggles [33]. Intel’s recent Vaunt Bluetooth-connected smart glasses look like ordinary prescription spectacles, and use a low-power laser to project images directly onto the user’s retina. This form factor would likely be more acceptable to users, and closer to Weiser’s invisible eyeglasses metaphor [2] [21].
A final requirement would be built-in eye-tracking technology, to meet accessibility needs. This would enable glance-control of the interface, similar to the Tobii eye-tracking software natively supported in Windows 10 [32]. Tobii have recently trialled their eye-tracking technology in a VR headset [8].
Sound
To create the impression of audio coming from specific places in a room, a multi-speaker system or surround headphones might be required. Headphones tend to tether the user, restrict head movements, and could block out the external sound. A shared experience requires the user to hear the programme audio in the natural ambience of their own room. But, for the illusion of sharing the room with other people, voices must also appear to come from within that room. Bose recently created a product called the SoundWear Companion. Paired via Bluetooth to an audio source, the SoundWear Companion, which looks like a small aeroplane neck pillow, creates a “cocoon of sound” around the user’s head [4] [9].
System
Guests would be invited and programmes chosen via an accompanying app, but to minimise taking the viewer out of the experience (and into a device full of potential distractions), a smart ring could facilitate gesture control, providing a more seamless experience while the show is playing. Gestures such as grabbing an avatar and positioning it in the field of view, or holding a hand over an avatar and opening and closing the hand to raise or lower volume, could be performed in the periphery without the viewer taking focus off the screen.
In 2014 Apple patented such a device, rumoured to be a future controller for Apple TV [3] [29].
A voice interface would further enable control and interaction without the user having to take their eyes off the television, and eye-tracking could also allow control and interaction options. Both of these would particularly benefit those with accessibility needs.
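As a rough illustration of the volume gesture described in this section, the sketch below maps how open the hand is to a volume level, and only when the hand is held over an avatar. It assumes the ring (or a companion sensor) can report a normalised hand position and an "openness" value between 0.0 (closed) and 1.0 (open); both inputs, the function names and the 0.1 radius are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]  # normalised field-of-view coordinates (assumed)


def hand_over_avatar(hand: Point, avatar: Point, radius: float = 0.1) -> bool:
    """Return True if the hand is within `radius` of the avatar's centre."""
    dx = hand[0] - avatar[0]
    dy = hand[1] - avatar[1]
    return dx * dx + dy * dy <= radius * radius


def volume_from_hand(openness: float, min_volume: int = 0,
                     max_volume: int = 100) -> int:
    """Map hand openness (0.0 closed to 1.0 open) to a volume level."""
    clamped = min(1.0, max(0.0, openness))
    return round(min_volume + clamped * (max_volume - min_volume))
```

Clamping the input means noisy sensor readings outside the expected range cannot push the volume beyond its limits.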
Evaluation Plan
An important stage of developing our system would be evaluating the design and the implementation, and testing the usability and functionality. We would recommend testing in users’ homes, because the environmental validity of being comfortable in their own home and watching on their own television will be important to enjoyment of the experience.
Our goals are to assess system functionality across a broad range of home environments and interconnected technologies, and to assess the ease of use of the interface. Our success criterion is a system that is intuitive to use for a wide range of users of mixed technical abilities.
Following are three examples of evaluations we could carry out:
Heuristic Evaluation
In a heuristic evaluation, usability experts review a system against an accepted set of usability principles. One of the best-known sets is Nielsen’s ten interaction design principles, or broad “rule of thumb” guidelines [14]. Two examples are: “visibility of system status”, meaning that the user should always know what is going on, with continuous and timely feedback; and “consistency and standards”, meaning that the same words, icons and symbols are used and, in the case of a computer interface, appear in a consistent place on the screen.
The advantage of this method is getting feedback early in the design process and it can be done inexpensively in the lab. A disadvantage is that it doesn’t always present options for fixing the problems.
A Think-Aloud Walkthrough
Participants in a think-aloud test perform a set of tasks with a system, while continually verbalising their thoughts. Jakob Nielsen describes it as “the single most valuable usability engineering method” [13].
In our think-aloud test we will create several scenarios, such as “select a television programme and invite your daughter to watch it with you”, and “contact your friend and ask him to choose a programme and invite you to watch with him”. Throughout the process, the user will describe such things as what they are doing, what they were expecting, what is working well for them and what is confusing them.
The benefit of this method is that it doesn’t take many users to quickly identify problems. The downside is that thinking aloud is an unnatural activity, and some people struggle to keep up the dialogue, or self-filter what they think isn’t important.
Eye-Tracking
As our system features built-in eye tracking, we can use it to evaluate the effectiveness of the interfaces. Eye tracking measures where the eye is focussed and how it moves. We can therefore gain a better understanding of how quickly users find the feature they are looking for and, when they don’t find it, where they are looking.
The benefit is that eye-tracking is a powerful tool and generates data that can be visualised and compared. A disadvantage is that eye tracking can’t tell if a user consciously saw, or didn’t see, something. It doesn’t explain why users are looking at things, and it can’t be conducted with think aloud as the thinking processes clash [30].
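One measure of how quickly users find a feature is the time to their first fixation inside the on-screen region that contains it. The sketch below shows how that could be computed from a list of fixations. It is an illustration only: the `Fixation` and `Region` types, their fields and the coordinate units are assumptions, not the output format of any particular eye tracker.

```python
from typing import Iterable, NamedTuple, Optional


class Fixation(NamedTuple):
    t: float  # start time in seconds since the task began
    x: float  # horizontal gaze position (assumed pixel units)
    y: float  # vertical gaze position (assumed pixel units)


class Region(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        """Return True if the point lies inside the region, edges included."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def time_to_first_fixation(fixations: Iterable[Fixation],
                           target: Region) -> Optional[float]:
    """Return when the user first fixated on the target, or None if never."""
    for fixation in sorted(fixations, key=lambda f: f.t):
        if target.contains(fixation.x, fixation.y):
            return fixation.t
    return None
```

A `None` result marks the tasks where the user never looked at the feature; the fixations recorded for those tasks show where they were looking instead.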
Further Evaluation Work
Our success will not be judged on the usability of the system alone, however. The ultimate, greater questions are:
1. Is watching a television programme with the system more enjoyable than watching television alone?
2. Does the system facilitate companionship?
These would require further studies, which could involve such evaluation tools as a modified Companion Scale, developed by Smock et al. to evaluate Facebook’s effectiveness at creating a sense of community [18].
Conclusion
This system offers the potential to seamlessly provide rich, shared experiences for families, separated by distance, to share the fun of watching their favourite TV shows together. Furthermore, it could provide a valuable service for people who are socially isolated. There is no single ecosystem of technologies available today that could deliver such a service; however, many of the constituent components and services already exist. In evaluating the system, we would aim to demonstrate that we can create an immersive and enjoyable experience that is close to actually being in the same room as a group of family and friends and enjoying the shared experience of watching television together.

References
1. Edgar Alvarez. 2018. Netflix is taking a wait-and-see approach to virtual reality. Engadget. Retrieved May 2, 2018 from https://www.engadget.com/2018/03/07/netflix-virtual-reality-not-a-priority/
2. Dieter Bohn. 2018. Intel’s new Vaunt smart glasses actually look good. The Verge. Retrieved April 30, 2018 from https://www.theverge.com/2018/2/5/16966530/intel-vaunt-smart-glasses-announced-ar-video
3. Mikey Campbell. 2015. Apple invents ring-style wearable device with voice control, haptics, cameras and more. Apple Insider. Retrieved May 1, 2018 from https://appleinsider.com/articles/15/10/01/apple-invents-ring-style-wearable-device-with-voice-control-haptics-cameras-and-more
4. Ashley Carman. 2017. Bose made a speaker for your neck. The Verge. Retrieved May 1, 2018 from https://www.theverge.com/circuitbreaker/2017/9/21/16346846/bose-soundwear-companion-wearable-speaker-neck
5. Dan Dalton. 2017. These Gamers Make Money By Streaming Their Play. Buzzfeed. Retrieved April 27, 2018 from https://www.buzzfeed.com/danieldalton/these-people-are-making-a-career-out-of-gaming?utm_term=.qxNG0gpDZ#.bsGGwQMex
6. Matthew DiPietro. 2014. Twitch is 4th in Peak US Internet Traffic. Twitch Blog. Retrieved April 26, 2018 from https://blog.twitch.tv/twitch-is-4th-in-peak-us-internet-traffic-90b1295af358
7. David Geerts. 2006. Comparing voice chat and text chat in a communication tool for interactive television. Proceedings of the 4th Nordic conference on Human-computer interaction changing roles - NordiCHI ’06, October: 461–464. https://doi.org/10.1145/1182475.1182537
8. Devindra Hardawar. 2018. Tobii proves that eye tracking is VR’s next killer feature. Engadget website. Retrieved May 7, 2018 from https://www.engadget.com/2018/01/13/tobii-vr-eye-tracking/
9. Tyll Hertsens. 2017. Bose SoundWear Companion Speaker. InnerFidelity. Retrieved May 1, 2018 from https://www.innerfidelity.com/content/bose-soundwear-companion-speaker
10. Julianne Holt-Lunstad, Timothy B. Smith, Mark Baker, Tyler Harris, and David Stephenson. 2015. Loneliness and Social Isolation as Risk Factors for Mortality: A Meta-Analytic Review. Perspectives on Psychological Science 10, 2: 227–237. https://doi.org/10.1177/1745691614568352
11. Charlene Jennett, Anna L. Cox, Paul Cairns, Samira Dhoparee, Andrew Epps, Tim Tijs, and Alison Walton. 2008. Measuring and defining the experience of immersion in games. International Journal of Human Computer Studies 66, 9: 641–661. https://doi.org/10.1016/j.ijhcs.2008.04.004
12. Pat Lawlor. 2015. The new era of immersive experiences: making it possible. Qualcomm website. Retrieved May 2, 2018 from https://www.qualcomm.com/news/onq/2015/08/20/new-era-immersive-experiences-making-it-possible
13. Jakob Nielsen. 2012. Thinking Aloud: The #1 Usability Tool. Nielsen Norman Group website. Retrieved May 7, 2018 from https://www.nngroup.com/articles/thinking-aloud-the-1-usability-tool/
14. Jakob Nielsen and Rolf Molich. 1990. Heuristic Evaluation of user interfaces. CHI ’90 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, April: 249–256. https://doi.org/10.1145/97243.97281
15. Ofcom. 2017. The Communications Market - August 2017. August: 436. Retrieved from https://www.ofcom.org.uk/research-and-data/multi-sector-research/cmr/cmr-2017 and http://stakeholders.ofcom.org.uk/market-data-research/market-data/communications-market-reports/cmr13/
16. Sarah Perez. 2018. Twitch now has 27K+ Partners and 150K+ Affiliates making money from their videos. TechCrunch. Retrieved April 27, 2018 from https://techcrunch.com/2018/02/06/twitch-now-has-27k-partners-and-150k-affiliates-making-money-from-their-videos/
17. Jacob M Rigby, Duncan P Brumby, Sandy J.J. Gould, and Anna L Cox. 2017. Media Multitasking at Home. Proceedings of the 2017 ACM International Conference on Interactive Experiences for TV and Online Video: 3–10. https://doi.org/10.1145/3077548.3077560
18. Andrew D. Smock, Nicole B. Ellison, Cliff Lampe, and Donghee Yvette Wohn. 2011. Facebook as a toolkit: A uses and gratification approach to unbundling feature use. Computers in Human Behavior 27, 6: 2322–2329. https://doi.org/10.1016/j.chb.2011.07.011
19. Nick Statt. 2017. Twitch’s new affiliate program will let almost any streamer earn money. The Verge. Retrieved April 27, 2018 from https://www.theverge.com/2017/4/21/15385190/twitch-affiliate-program-ad-revenue-game-streaming-youtube
20. Greg Wadley, Marcus Carter, and Martin Gibbs. 2015. Voice in virtual worlds: The design, use, and influence of voice chat in online play. Human-Computer Interaction 30, 3–4: 336–365. https://doi.org/10.1080/07370024.2014.987346
21. Mark Weiser. 1994. The World Is Not A Desktop. Interactions 1, 1.
22. J.D. Weisz, S. Kiesler, H. Zhang, Y. Ren, R.E. Kraut, and J.A. Konstan. 2007. Watching together: Integrating text chat with video. Conference on Human Factors in Computing Systems - Proceedings: 877–886. https://doi.org/10.1145/1240624.1240756
23. Chris Welch. 2018. Now you can watch Plex in virtual reality with Google Daydream. The Verge website. Retrieved May 2, 2018 from https://www.theverge.com/circuitbreaker/2018/1/24/16927136/plex-vr-google-daydream-now-available-features
24. Companion Screens. BBC R&D. Retrieved April 30, 2018 from http://www.bbc.co.uk/rd/projects/companion-screens
25. Theatre at Home (Virtual Theatre Box). 2-IMMERSE website. Retrieved April 30, 2018 from https://2immerse.eu/theatre-at-home/
26. 2009. Television from 1936 Is Britain’s oldest set. The Daily Telegraph. Retrieved April 26, 2018 from https://www.telegraph.co.uk/news/uknews/5865623/Television-from-1936-Is-Britains-oldest-set.html
27. 2011. A Short History of British Television. The National Science and Media Museum website. Retrieved April 26, 2018 from https://blog.scienceandmediamuseum.org.uk/chronology-british-television/
28. 2014. Loneliness and social isolation in the United Kingdom. The Campaign to End Loneliness. Retrieved January 14, 2018 from https://www.campaigntoendloneliness.org/loneliness-research/
29. 2014. United States Patent Application: Devices and Methods for a Ring Computing Device. US Patent and Trademark Office. Retrieved May 1, 2018 from http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=46&p=1&f=G&l=50&d=PG01&S1=(345%2F173.CCLS.+AND+20151001.PD.)
30. 2014. Eye Tracking. usability.gov website. Retrieved May 7, 2018 from https://www.usability.gov/how-to-and-tools/methods/eye-tracking.html
31. 2016. Reaction Videos. Know Your Meme. Retrieved April 27, 2018 from http://knowyourmeme.com/memes/reaction-videos
32. 2017. Tobii and Microsoft Collaborate to bring Eye Tracking Support in Windows 10. Tobii website. Retrieved May 1, 2018 from https://www.tobii.com/group/news-media/press-releases/2017/8/tobii-and-microsoft-collaborate-to-bring-eye-tracking-support-in-windows-10/
33. 2018. Magic Leap. Magic Leap website.