Last modified: Tue Sep 23 1997

Multimedia server taxonomy and evaluation

by Tobias Öbrink

A presentation of activities in WP7 for MERCI meeting at TELES in Berlin September 1997



1. Introduction

Digital audio, video and computer-supported cooperation tools are HYPE. VoD and Multimedia servers are sprouting up everywhere like weeds. There are several presentation languages and distributed programming languages waiting for streaming material to display in flashy shows on ITV, networked kiosks, home computers, you name it. Every platform vendor has presented its own high-performance solution, and a lot of heavyweight players, for example Microsoft, Netscape and Progressive Networks, have launched their own products. The IETF and ITU-T work hard to keep up with the current avalanche-like development and try to cooperate with each other as well as with the commercial vendors to produce working standards.

During the last year I have been collecting information about research projects and commercial development efforts as well as the standardisation efforts related to networked digital multimedia in general, and MoD in particular. I have a lot of information, but no common structure in which to present it. When I look at all these announcements of new products and all the new research papers on Multimedia server systems, I wish I had some classification scheme to use in comparing the different solutions: a general taxonomy that is not tied to a specific product or research prototype.

1.1 Workplan

We will develop a taxonomy of Multimedia server-related elements to help in comparing different Multimedia server solutions with regard to

  • Relevant Standards
  • Functionality
  • System Design
  • Performance

    In parallel with this effort, I will continue to collect information about, and test, Multimedia servers to produce a survey of existing solutions using the above taxonomy. It will be an iterative process where the two parts complement each other.

    The survey will of course become outdated relatively soon, but my hope is that the taxonomy will have a more lasting value.

    2. Multimedia server taxonomy

    The template defined in this section is filled with the union of the features of all the evaluated Multimedia server systems and of issues from the literature (see the References section for a list of the literature studied). This serves to motivate the parts of the template and to show what is possible. The terminology presented in Appendix A is used throughout this template.

    2.1 System functionality

    Properties of, and functionality supported by, the Multimedia server system.

    2.1.1 Main field of application

    The main purpose of the Multimedia server system. It could be lecture-on-demand, news-on-demand, movie-on-demand, conference recorder, network-based multimedia presentation, home VCR-like scheduled recording, and playback on demand (archiving), multimedia mail.

    2.1.2 Extension support

    How easy it is to extend the Multimedia server system with additional functionality.

    APIs, modules for extending the server's functionality, interoperability, security and the ability to easily add new methods and parameters to the protocols used.

    2.1.3 Call setup

    Call setup-related functionality supported by the Multimedia server system.

  • Capability negotiation
  • A mechanism for a Participant to determine the capabilities of Multimedia servers and other Participants. Capability negotiation allows user applications to present the appropriate user interface and Proxies to make decisions. For example, if seeking is not allowed, the user interface must be able to disallow moving a sliding position indicator. Stream transport negotiation is another important functionality.
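
    The user interface consequence of capability negotiation described above can be sketched as follows. This is only a minimal illustration: the `ServerCapabilities` record, the method names ("play", "seek" and so on) and the transport labels are hypothetical and are not taken from any particular protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerCapabilities:
    """Hypothetical result of a capability query (names are illustrative)."""
    methods: frozenset = frozenset()
    transports: tuple = ()


def ui_controls(caps):
    """Map the negotiated capabilities to enabled user interface controls."""
    return {
        "play_button": "play" in caps.methods,
        "pause_button": "pause" in caps.methods,
        # If seeking is not allowed, the sliding position indicator
        # must not be movable by the user.
        "position_slider_movable": "seek" in caps.methods,
        "record_button": "record" in caps.methods,
    }


def choose_transport(client_prefs, caps):
    """Pick the first client-preferred Stream transport the server offers."""
    for transport in client_prefs:
        if transport in caps.transports:
            return transport
    return None


caps = ServerCapabilities(
    methods=frozenset({"play", "pause"}),
    transports=("RTP/UDP unicast", "TCP interleaved"),
)
controls = ui_controls(caps)
transport = choose_transport(["RTP/UDP multicast", "RTP/UDP unicast"], caps)
```

    Here the server does not announce "seek", so the slider is locked, and the client falls back to its second transport preference.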

    2.1.4 Browsing

  • Browse active Presentations.
  • Search and Browse recorded Sessions, Participants, Streams, Titles.

    2.1.5 Playback

    Which Playback-related functionality is supported by the Multimedia server system.

  • Playback types
  • Direct playback - A new session is created by the Multimedia-server for playback.
    Indirect playback - The Multimedia-server joins an already existing conference.

  • Playback-related user interface functions
  • The functionality for controlling play provided by the client application's user interface. F. ex

    2.1.6 Recording

    Which Recording functionality is supported by the Multimedia server system.

  • Record-related user interface functions
  • Browse active Sessions, Participants, Streams and "point-and-click" those you want to record.
    Start recording, stop recording, set time to start and end.

    2.1.7 Editing

    Which Editing functionality is supported by the Multimedia server system.

  • User interface functions
  • Edit content on Media-, and/or Title level.
    Filter/merge and other post-processing on Session, Participant, Stream and sub-packet level (block, sample, object).
    Orchestration (creating "artificial" time-stamps) using the "atomic" time steps of the level 0 index, or creating new Streams based on processing on sub-packet level.
    Analysis tools to help editing and statistics.
    Editing quality and other aspects of Media coding.
    Eliminate initial silence in Streams, Participants, Sessions.
    Create layered encodings from non-layered originals.

    2.1.8 Admission control

    Whether the Multimedia server system has some admission control scheme for access to the server, and whether the Control session transport is secured.

    2.2 System design

    What does the Multimedia server system architecture look like.

  • Client-Server
  • Client/Server with powerful, heavy servers with lots of resources, minimal client programs and playback tools.

  • Cluster
  • Distributed Multimedia server system with lots of small distributed servers using a common frontend or a lot of frontends using an internal communication channel.
    In a distributed Multimedia server system each Media Stream within a Presentation may reside on a different server. In this case the Client automatically establishes several concurrent control sessions with the different media servers. Media synchronization is performed at the transport level.

  • A Single point of contact
  • A single entrypoint for playback, recording, editing and admission control with a common interface for all functionality, preferably graphical and WWW-based.

    2.2.1 Client

    Handles all direct user interaction with the Multimedia server.

  • Supported playback tools
  • Which applications are compatible for use to playback the Streams delivered by the Multimedia server system.

    2.2.2 Server

    The Multimedia server software handles content management, streaming, editing of stored content and much more. It can be stateless, delivering requested Streams like an HTTP server, or stateful to allow more intricate user interaction and spare the network from unnecessary Streams. The internal functionality of a Multimedia server includes

  • Multicast address allocation.
  • For direct playback to a multicast address the Multimedia server needs to assign one or more unique multicast address/port pairs. Such a mechanism is helped by

  • Setting scope of Presentation.
  • The scope of a Presentation can be set either by using a multicast address in an administrated range or by limiting the Time-To-Live(TTL) of Streams. This scope can be computed by the Multimedia server or given by the Client.

  • Recording of Conferences.
  • The Multimedia server is invited to an existing Conference and the Client then establishes a Control session.

  • Replay of Conferences.
  • The Multimedia server reconstructs the original Conference as far as possible.

    Reception timestamps. The Stream is reconstructed exactly as it was received by the Multimedia server.
    Original Sender timestamps. The Stream is reconstructed by using original Sender timestamps.
    Burst transmission. The Stream is reconstructed at the receiver using buffers.
    Indexed playback with "artificial" timestamps created at editing time.
  • Play "fake Conferences".

  • Post-edited Presentations made from original Conferences. As Replay of Conferences above, but with specially generated RTCP sender reports.

  • Play orchestrated Presentations.

  • Play Titles.

  • Fast forward and rewind.

  • Visual Fast forward and Rewind by creating new indexes, by increasing the data rate, or by having multiple copies of content coded for different presentation speeds.

  • Pause

  • When the Multimedia server pauses playing/recording of a Presentation or a single media Stream, it keeps the resources that were reserved when the playing/recording started. After resuming playing/recording, synchronisation of the media Streams in the Presentation must be maintained. A Pause request may contain a time when the Stream or Presentation is to be halted. If no time is given, pausing starts as soon as possible. Pausing of an individual Stream means that the playout of the Stream is muted at the Sink. Pausing of an entire Presentation means that playout of all Streams in the Presentation is paused. A subsequent Play request for the same Resource cancels the Pause. If a Pause request of a Presentation contained a time, then the playing will resume at that time.

  • End of Play/Record range.

  • After Playing/Recording a requested range, the Presentation is automatically Paused.

  • Live event Presentation.

  • While Participating in a live Presentation, the Client may be offered the ability to control capture devices such as cameras and microphones.

  • Reliable Multicast support.

  • To be able to correctly record a Session that includes tools using reliable multicast, the Multimedia server must participate actively in the Session, for example by requesting retransmission of lost packets. This means that the Multimedia server must support the protocols used by the applications.

  • Redirection

  • If a Resource does not exist on the Multimedia server or it is too loaded, it may redirect the Client to another location.
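
    The Conference replay alternatives listed above (reception timestamps, original Sender timestamps and burst transmission) differ only in how the send time of each stored packet is derived. The sketch below illustrates this under simplifying assumptions: the record layout (`arrival`, `sender_ts`, `payload`) and the `clock_rate` parameter are hypothetical, and timestamp wrap-around is ignored.

```python
def replay_schedule(packets, mode="reception", clock_rate=8000):
    """Return (send_offset_seconds, payload) pairs for a recorded Stream.

    packets: list of dicts with 'arrival' (seconds on the server clock),
    'sender_ts' (sender timestamp in clock ticks) and 'payload'.
    """
    if not packets:
        return []
    if mode == "reception":
        # Reproduce the Stream exactly as the server received it,
        # including the jitter introduced by the network.
        base = packets[0]["arrival"]
        return [(p["arrival"] - base, p["payload"]) for p in packets]
    if mode == "sender":
        # Reproduce the timing intended by the original Sender.
        ordered = sorted(packets, key=lambda p: p["sender_ts"])
        base = ordered[0]["sender_ts"]
        return [((p["sender_ts"] - base) / clock_rate, p["payload"])
                for p in ordered]
    if mode == "burst":
        # Send everything at once; the receiver reconstructs timing
        # from its own buffers.
        return [(0.0, p["payload"]) for p in packets]
    raise ValueError(f"unknown replay mode: {mode}")


recorded = [
    {"arrival": 10.00, "sender_ts": 0, "payload": b"a"},
    {"arrival": 10.05, "sender_ts": 160, "payload": b"b"},
    {"arrival": 10.06, "sender_ts": 320, "payload": b"c"},
]
by_sender = replay_schedule(recorded, "sender")
by_arrival = replay_schedule(recorded, "reception")
```

    With an assumed 8000 Hz clock, the sender-based schedule spaces the packets evenly 20 ms apart, while the reception-based schedule preserves the 50 ms and 10 ms gaps seen by the server.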

    2.2.3 Communication

    The communication scheme of a Multimedia server system ranges from very simple request-reply schemes, as in movie-on-demand, to intricate distributed clusters of Multimedia servers with support for real-time editing, Proxy servers and mirror servers. The Control session protocol can be stateless, like HTTP, or stateful to allow more intricate user interaction and spare the network from unnecessary Streams. The following issues have been identified

  • Multicasted User Interaction

  • Every Participant can see the others' actions, giving a sense of awareness. How to handle sequencing of Control Messages? Total ordering of events is needed.

  • Multicasted Presentation delivery

  • The case that we have a "moderator" and a bunch of "listeners" is ideal for multicasted Presentation delivery.

  • Grouping of Presentations

  • How to manage multicast addresses to take advantage of the fact that a few users are looking at the same clip? The scalability gain appears close to the receivers/sender. High interactivity reduces the scalability gain. Handling multicast addresses is (so far) more computationally expensive than handling ordinary unicast addresses.

  • Reliable Multicast

  • Issues when delivering reliable multicast Streams

  • Limit scope of a Presentation

  • Network mechanisms for scope limitation

  • Inviting Participants

  • The Client supplies a list of Participants to receive the Presentation. These can be invited using the Session Invitation Protocol (SIP[12]) provided they have tools able to handle this protocol. If support for SIP[12] cannot be depended upon, a Session Description Protocol packet must be sent to the Client in some other way. It is then up to the Client to contact the other Participants.

  • Invitation of a Media Server to a Conference

  • Either for recording or indirect playback. If the server is to participate in an existing multicast conference, the multicast address, port, encryption key and the Conference's globally unique identifier are given by the Conference's Session description provided by the Client. Upon accepting the invitation, the server will use a Control session protocol for further negotiation and subsequent control.

  • Control session transport

  • Control session Messages can be transmitted in several different ways:

    Messages can be either requests or responses. Both the Client and the Media server may issue requests if a persistent connection is used. For the other two, there is no reliable way of reaching the Client. It is also possible to "pipeline" requests on a persistent connection.

    Messages are equipped with sequence numbers to detect reordering. The sequence number should include a timestamp to avoid Messages being confused between intermediate connectionless transmissions.

    To increase interaction responsiveness in LANs or other networks with small round-trip times (RTT), one can use a connectionless transport in combination with an RTT estimation to compute optimized retransmission timeouts. For this purpose each connectionless transaction carries a transaction timestamp for each request transmitted. Repeated requests due to lack of acknowledgement get a new timestamp. This separate timestamp is used instead of the Message sequence number to avoid ambiguities if a transport packet is lost.

  • Separation of Presentation and Control session

  • The Presentation and its corresponding Control session are logically separated, even though they may be interleaved on the same transport connection(s).

  • Interleaved media Streams and Control Messages

  • Certain firewall designs and other circumstances may force a Multimedia server to interleave Control Messages and media Streams on the same TCP transport. In this case the Stream packets will be encapsulated in Messages with a special header.

  • Heterogeneous networks

  • The Internet is the target; therefore support for layered coding and transmission is needed.

  • Security issues

  • Transport level security for control signaling and user authentication.

  • Firewall friendly

  • The Control session protocol should be readily handled by both application and transport-layer (SOCKS[25]) firewalls.

  • HTTP-friendly

  • Where sensible, the Session control protocol should re-use HTTP concepts, so that the existing infrastructure can be re-used. This infrastructure includes PICS (Platform for Internet Content Selection [26]) for associating labels with content.

  • Caching in Proxies

  • Unlike in HTTP, it is mostly Stream content that may be cached, since Control session Messages are highly time- and context-dependent. Some Messages, like Presentation descriptions, may also be cached.

    The Proxy must be able to ascertain whether it has an up-to-date copy of the Resource. If not, it will forward the request Message to the Origin server, and then pass on the media Stream(s) while possibly making a local copy. The Proxy could also request the Stream(s) directly from the Origin server for transport over a reliable connection to preserve quality.

  • Control migration

  • If multiple Clients are allowed for a Presentation or if some floor control functionality (passing the remote) is supported.
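
    The retransmission timeout computation mentioned under Control session transport can be sketched with a smoothed RTT estimator. The smoothing scheme below is the one commonly used for TCP (a smoothed mean plus four times a smoothed deviation); it is an assumption chosen for illustration, not something this taxonomy prescribes, and the class and parameter names are hypothetical.

```python
class RttEstimator:
    """Smoothed RTT estimator driving a retransmission timeout (RTO)."""

    def __init__(self, initial_rto=0.5, alpha=0.125, beta=0.25, min_rto=0.005):
        self.srtt = None      # smoothed round-trip time, seconds
        self.rttvar = None    # smoothed RTT deviation, seconds
        self.rto = initial_rto
        self.alpha = alpha
        self.beta = beta
        self.min_rto = min_rto

    def on_response(self, echoed_timestamp, now):
        """Update the estimate from the transaction timestamp echoed back.

        Because every (re)transmission carries a fresh timestamp, the
        sample always belongs to the transmission that was answered.
        """
        sample = now - echoed_timestamp
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - sample))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        self.rto = max(self.min_rto, self.srtt + 4 * self.rttvar)
        return self.rto


est = RttEstimator()
first_rto = est.on_response(echoed_timestamp=100.000, now=100.004)
second_rto = est.on_response(echoed_timestamp=101.000, now=101.004)
```

    On a LAN with a 4 ms round trip the timeout quickly drops from the conservative initial value to around 10 ms, which is what makes the connectionless transport more responsive there.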

    2.2.4 Storage strategy

    The way content is handled internally by the Multimedia server software. Multimedia content is often stored hierarchically: first Presentation-related information, then information related to Original Participant Sources and Titles, and finally Stream-related information. Indexes are used for arbitrary access to Streams and for other functionality as described below.

  • Data stored on Presentations consists of the following information
    Multicast addresses used if applicable
    Reference to Participant Sources and/or Titles related information
    Inter-participant synchronization information if applicable
    Presentation description stub

  • Data stored on Participant Sources consists of the following information
    Source address
    RTCP source info
    Inter-stream synchronizing information
    References to streams and index files

  • Data stored on Titles consists of the following information
    Name of Title
    Version
    Originator
    Short description
    Inter-stream synchronizing information
    References to streams and index files

  • Data stored on Streams consists of the following information
    Packet contents (Media)
    Headers if recorded Stream
    Time of arrival if recorded Stream
    Payload information, such as format, bit rate, sample rate, play speed, etc.

  • Indexing. Indexes are used locally in a Multimedia server to support functionality operating on Presentations or Streams, such as random access, fast forward, rewind, pause/resume, editing, etc. The use of indexes should be transparent to the ordinary consumer, but visible for content creators/editors. Supported number of index levels is rarely more than one. The following types of indexing have been identified.
    Indexes into Streams (level 0 index) that contain
      • Reference to data (byte offset)
      • Arrival timestamp
      • Reference to Meta-data (for example a text or HTML description)
    Bookmarks into level 0 indexes
    Indexes into indexes.

  • Database Management. Different solutions are suitable for different Media types
    Raw partition - No limit in file size
    File system - Less complex software and more portable
    Relational DBMS - For index hierarchies and other non-Stream data.
    Object-oriented DBMS
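
    The index types listed under Indexing above can be illustrated with a small in-memory model. This is a sketch under assumptions: the `IndexEntry` layout simply mirrors the three level 0 fields named in the list, and the function names are hypothetical.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One level 0 index entry."""
    arrival: float   # arrival timestamp, seconds
    offset: int      # reference to data (byte offset into the Stream)
    meta: str = ""   # reference to Meta-data, if any


def seek(level0, t):
    """Random access: last entry whose arrival timestamp is <= t."""
    times = [e.arrival for e in level0]
    i = bisect.bisect_right(times, t) - 1
    return level0[max(i, 0)]


def build_level1(level0, step):
    """Index into the index: positions of every step-th level 0 entry,
    for example to support fast forward."""
    return list(range(0, len(level0), step))


level0 = [
    IndexEntry(0.0, 0),
    IndexEntry(0.5, 1200),
    IndexEntry(1.0, 2500, meta="chapter2.html"),
    IndexEntry(1.5, 3700),
]
bookmarks = {"chapter 2": 2}          # bookmark into the level 0 index
level1 = build_level1(level0, step=2)
```

    A seek to 1.2 seconds lands on the entry at byte offset 2500, and a fast-forward pass that follows `level1` visits only every second entry.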

    2.2.5 Hardware solution

    If the Multimedia server depends on certain hardware configuration(s) it should be described.

    2.3 Performance

    Documented and, if available, tested performance values and subjective comments.

    2.3.1 Network related

  • Max Number of Users
  • Max Number of Streams
  • Max output bandwidth

    2.3.2 Media related

  • Max Number of video Frames per second in a video Stream
  • Max Size in pixels of a video Frame

    2.3.3 System design related

  • Max Number of Users
  • Max Number of Streams
  • Max output bandwidth

    2.3.4 User friendliness

    Consumers are mostly humans, therefore we get the following constraints:

  • More than one Media at the same time. I.e. multiple different constraints on receiver buffering delay, loss and jitter.
    Media      | Buffering delay | Jitter sensitive | Loss sensitive
    Audio      | Long            | Little           | Very
    Video      | Very short      | Little           | Not
    Other data | Medium          | Not              | Very
  • Random access and recording. Highly interactive. I.e. roundtrip-time constraints on interactions.
  • Person-to-system. I.e. less roundtrip-time constraints on replay. People tend to be more patient when dealing with machines.

    With these constraints in mind we see that a good Multimedia server should behave as follows:

    1. Have a relatively long initial delay before replay when data is buffered at the receiver
    2. Prioritize the replay of Media at the receiver in the following order: Video, Audio, Data
    3. Prioritize the transfer and reception of Media in the following order: Interactions, Audio, Data, Video.
    4. Buffer a little Audio, and much Data before replay at the receiver.
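
    The transfer ordering in point 3 above can be sketched as a priority queue on the sending side. The class name and the lowercase media labels are hypothetical; only the ordering itself (Interactions, Audio, Data, Video) is taken from the list.

```python
import heapq
import itertools

# Lower number means "send first", following point 3 above.
TRANSFER_PRIORITY = {"interaction": 0, "audio": 1, "data": 2, "video": 3}


class TransferQueue:
    """Outgoing queue that serves Media kinds in priority order."""

    def __init__(self):
        self._heap = []
        # The counter keeps first-in first-out order within one kind.
        self._counter = itertools.count()

    def put(self, kind, payload):
        entry = (TRANSFER_PRIORITY[kind], next(self._counter), kind, payload)
        heapq.heappush(self._heap, entry)

    def get(self):
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self):
        return len(self._heap)


queue = TransferQueue()
for kind, payload in [("video", "v1"), ("data", "d1"), ("audio", "a1"),
                      ("interaction", "pause"), ("audio", "a2")]:
    queue.put(kind, payload)
order = [queue.get() for _ in range(len(queue))]
```

    A strict priority queue like this can starve video under load; a real server would probably combine it with rate limits per Media kind.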

    2.3.5 Administrator friendliness

  • Robust
  • Easy to install and setup
  • Easy to maintain and upgrade
  • Availability of support

    2.4 Standards

    Which standards the Multimedia server system complies with. This section presents a few common standards, mainly for use in communication between the different parts of the system.

    2.4.1 Coding and Compression

    Some Multimedia servers have limitations as to the encoding format of the Streams they play.

    2.4.2 Call setup

    Call setup is the phase before a Conference is initiated, when Participants are invited, capabilities exchanged, billing issued, authentication taken care of, Presentation descriptions exchanged, transport channels set up and optionally Media parameters are set.

  • Indirect Call setup
    For the Client(s) to be able to invite the Multimedia server system to an existing Conference (for the purpose of recording the Conference or inserting a Presentation) they need some sort of invitation protocol.
    The same sort of protocol may be used by the Client(s) to invite other Participants to receive a multicasted Presentation, and, perhaps, by the Multimedia server system to avoid playing of multiple identical Presentations to different Participants by grouping them together.
    The Client(s) may also obtain a Presentation description by using either SAP[13], RTSP[1] , email, HTTP, FTP, Telnet, shared file-system, paper or some other way. The information in this Presentation description may then be used to do a Direct Call setup without using any invitation protocol.
    SIP[12] - Use existing Location servers and User agents to find a Multimedia server and request that it joins the Conference by sharing the Conference's Session description. It is then up to the Client(s) to establish a Control session with the Multimedia server system.
    H.310[19], H.320[20], H.321[21], H.322[22], H.323[23 & 31], H.324[24] - Video telephony over different network types, includes functionality for Call setup.

  • Direct Call setup
    The Client(s) need to know the location of, and establish a Control session with, the Multimedia server system. They must also negotiate media Stream transport(s) and other capabilities as well as optional Media parameters by using the Session control protocol.
    RTSP[1] - Retrieve a Presentation description from, set transport parameters on, and control the Multimedia server system.

    2.4.3 Presentation description

    The Presentation description is a view of a Presentation that is shared by the Client and the Multimedia server. It is distributed between the Client and the Multimedia server before the Presentation starts and may subsequently be updated if new content becomes available.

    SDP[8] - A Session description format.

    2.4.4 Session Control

    The Client(s) control the Presentation via some sort of messaging or signalling protocol. The server must be able to tell the Client(s) about additional Media as it becomes available.

    RTSP[1] - Establishes and controls either a single or several time-synchronized Streams of Continuous media. The control Stream may be interleaved with the playback Stream(s) or separate. Acts as a "network remote control" for Multimedia servers.
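
    Interleaving the playback Stream(s) with the control Stream on one connection requires some framing so that the receiver can tell them apart. The sketch below assumes the framing used in the published RTSP specification (a '$' byte, a one-byte channel identifier and a two-byte length in network byte order); the function names are hypothetical, and the sketch handles only media frames, not the text Control Messages that share the connection.

```python
import struct

MAGIC = 0x24  # ASCII '$' marks an interleaved binary frame


def frame(channel, packet):
    """Wrap one Stream packet for transmission on the control connection."""
    if not 0 <= channel <= 255:
        raise ValueError("channel must fit in one byte")
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for a 16-bit length field")
    return struct.pack("!BBH", MAGIC, channel, len(packet)) + packet


def deframe(buffer):
    """Split received bytes into complete (channel, packet) frames.

    Returns the frames plus any trailing bytes of an incomplete frame,
    which the caller keeps until more data arrives.
    """
    frames = []
    pos = 0
    while len(buffer) - pos >= 4:
        magic, channel, length = struct.unpack_from("!BBH", buffer, pos)
        if magic != MAGIC:
            raise ValueError("not an interleaved frame")
        if len(buffer) - pos - 4 < length:
            break  # frame body not fully received yet
        frames.append((channel, buffer[pos + 4:pos + 4 + length]))
        pos += 4 + length
    return frames, buffer[pos:]


wire = frame(0, b"rtp") + frame(1, b"rtcp!") + b"$\x00"
frames, rest = deframe(wire)
```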

    DSM-CC[7] - .

    2.4.5 Transport

    Protocols handling the delivery of Call setup-, and Control Messages, Streams and Presentation/Session descriptions.

  • Streams
    Streams are carried by IP multicast or unicast, both reliable and unreliable: RTP[2], UDP[14], TCP[15], RDP[16], SRM[30], etc.

  • Presentation/Session description
    Presentation- and Session description files can be delivered by HTTP[5 & 6], FTP[19], SIP[12], SAP[13], RTSP[1], Telnet, email or some other way.

  • Call setup and Session control
    Call setup and Session control Messages are carried by IP multicast or unicast, both reliable and unreliable. UDP[14], TCP[15] or RDP[16].

  • Multicast address allocation
    When the Multimedia server is asked to deliver a Presentation on multicast Streams, the following protocols can be helpful when assigning new multicast addresses.
    SAP[13] - See currently announced Conferences so the Multimedia server system can do a better multicast address allocation.
    RTP[2] - Multicast address/port pair collision recovery.
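
    How listening to announcements can improve multicast address allocation is sketched below. The sketch assumes that the server keeps a set of address/port pairs it has seen announced, that it allocates from the administratively scoped IPv4 range 239.0.0.0/8, and that it picks even port numbers from an arbitrary range; the function names and the port range are hypothetical.

```python
import ipaddress
import random

ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")


def is_admin_scoped(addr):
    """True if addr lies in the administratively scoped multicast range."""
    return ipaddress.ip_address(addr) in ADMIN_SCOPED


def allocate(announced, rng=random, tries=1000, ports=(16384, 32767)):
    """Pick a random address/port pair that is not currently announced.

    announced: set of (address, port) pairs learned from announcements.
    """
    for _ in range(tries):
        host = rng.randrange(1, ADMIN_SCOPED.num_addresses - 1)
        addr = str(ADMIN_SCOPED.network_address + host)
        port = rng.randrange(ports[0], ports[1], 2)  # even port number
        if (addr, port) not in announced:
            return addr, port
    raise RuntimeError("no free multicast address/port pair found")


first = allocate(set(), random.Random(7))
# Same random sequence, but the first pair is now known to be in use.
second = allocate({first}, random.Random(7))
```

    Announcements only reduce the risk of a clash; collisions with unannounced Conferences still have to be detected and recovered from after the fact.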

    2.4.6 User Interface

    The kind of user interface provided by client applications for the Multimedia server system.

  • Graphical User Interface (GUI)
    Tcl/Tk-based, HTML Forms or Java.

  • Other types of User Interface
    Line-based Telnet, serial or more structured text interfaces.

    2.4.7 Interoperability support

    IP[17], RTP[2], H.310[19], H.320[20], H.321[21], H.322[22], H.323[23 & 31], H.324[24] - Standards for ensuring interoperation over different networks, platforms, codecs.

    RTSP[1] - Standard for ensuring interoperation of Media server Control session Messages.

    2.4.8 Security

    Insecure networks, end-points, proxies and mirrors call for transport level security (TLS[10]) for the Control session and at least standard HTTP basic[6] and digest authentication[11].

    2.4.9 Time representation

    A Multimedia server may have to keep track of multiple time axes depending on which functionality is supported. The Media time axis, the Presentation time axis and the "real-world" time axis.

  • Media time axis
    SMPTE relative timestamps - Society of Motion Picture & Television Engineers (SMPTE[32]) relative timestamps express time relative to the start of a clip and allow frame-level access accuracy in the Media time axis. The frame part is only relevant for frame-oriented media. The time code has the format hours:minutes:seconds:frames.subframes.
    ISO 8601 - Timestamps in UTC. The time code has the format yyyymmddThhmmss.fractionZ.

  • Presentation time axis
    Normal Play Time (NPT[7]) - DSM-CC Normal Play Time allows access on tenths of a second accuracy in the Presentation time axis. The time code has the format hours:minutes:seconds.tenth or seconds.tenth.
    ISO 8601 - Timestamps in UTC. The time code has the format yyyymmddThhmmss.fractionZ.

  • "real-world" time axis
    ISO 8601 - Timestamps in UTC. The time code has the format yyyymmddThhmmss.fractionZ.
    Unix - Seconds after 00:00:00 UTC Jan 1, 1970.
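
    The time code formats listed above can be produced from plain second counts as sketched below. The function names are hypothetical, the SMPTE variant assumes an integer frame rate (no drop-frame handling) and omits subframes, and the ISO 8601 variant prints two fraction digits as an arbitrary choice.

```python
from datetime import datetime, timezone


def npt(seconds):
    """Normal Play Time as hours:minutes:seconds.tenth."""
    tenths = round(seconds * 10)
    hours, rem = divmod(tenths, 36000)
    minutes, rem = divmod(rem, 600)
    secs, tenth = divmod(rem, 10)
    return f"{hours}:{minutes:02d}:{secs:02d}.{tenth}"


def smpte(seconds, fps=25):
    """SMPTE relative timestamp as hours:minutes:seconds:frames."""
    total_frames = round(seconds * fps)
    hours, rem = divmod(total_frames, 3600 * fps)
    minutes, rem = divmod(rem, 60 * fps)
    secs, frames = divmod(rem, fps)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


def iso8601(unix_seconds):
    """UTC timestamp as yyyymmddThhmmss.fractionZ from Unix seconds."""
    dt = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return dt.strftime("%Y%m%dT%H%M%S") + f".{dt.microsecond // 10000:02d}Z"


position = 3723.4  # seconds from the start of the clip
npt_code = npt(position)
smpte_code = smpte(position)
epoch_code = iso8601(0)
```

    The same position of 3723.4 seconds reads 1:02:03.4 on the Presentation time axis and 01:02:03:10 on a 25 frames per second Media time axis.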

    3. Multimedia server evaluation

    In this section I describe one evaluated Multimedia server system using the pattern described in the previous section. In time I hope to be able to incorporate most of the Multimedia server systems listed under the Multimedia server survey Section.

    3.1 Sun MediaCenter 20

    The Sun MediaCenter server is a combination of standard server hardware and special software that is optimized for, and dedicated to, the storage and delivery of video streams.

    KTH/Teleinformatics has received a Sun MediaCenter 20 for evaluation. It is equipped with three FastEthernet interfaces and is able to stream MPEG 1&2 over UDP/IP. The client runs on Sun/Solaris and PC/Windows95.

    http://www.sun.com/products-n-solutions/hw/servers/smc_external.html

    3.2 System functionality

    The MediaCenter Client-Server system has the following functionality and properties.

    3.2.1 Main field of application

    Suitable for MPEG 1&2 Movie-delivery.

  • LAN Movie-on-Demand by using an external intermediary Switch acting as MUX/DMUX between Client and Server.
  • Pre-programmed Movie-delivery to a network.
  • Cable-TV (CATV) Near Video On Demand (NVOD) using external Client/Gateway and a D/A modulator.
  • Movie-proxy-cache using external proxy frontend. There's a 5 second initial delay for play-through.

    3.2.2 Extension support

    Uses ONC+(TM) RPC-based APIs for all user interaction and content loading. This makes it possible to implement new User interfaces or integrate the MediaCenter server into more complex systems.

    3.2.3 Call setup

    The Client sets the destination address and chooses speed, bit rate and format for each Stream directly through the RPC-API.

    3.2.4 Playback

    Playback is handled by a software entity called a "Player". All playback interaction is performed by issuing commands to the Player. It uses Direct Playback and has the following playback-related user interface functions

  • Search and Browse Titles.
  • Start play, kill play, set speed, set direction, set time to start and end.
  • Visual Fast forward and Rewind.
  • Set a minimal playback session scope.
  • Advertise additional Media as it becomes available (the Client or server updates the Presentation description in real-time).
  • Pause a Presentation(halt). Pause at a certain time.
  • Resume a Presentation.

    3.2.5 Recording

    No Recording is possible. The content is loaded into the MediaCenter server using FTP or a proprietary TCP-based protocol.

    3.2.6 Editing

    It is possible to edit Titles by combining MPEG streams and "dead air" into a "play list". It is also possible to edit a Presentation's playlist during playback. It is not possible to edit MPEG streams stored on the MediaCenter server.

    3.2.7 Admission control

    Authentication is supported by Secure RPC. Access to the server and to Players is based on the standard Solaris mechanism relying on /etc/nsswitch.conf. There are three access levels: read, control and admin.

    Table 1: Server and Player Access Levels
    Access level | Server | Player
    read | Can get list of players, titles and states | Can get access, persistence, playlist, connection and play status info
    control | Unused | Can control play speed, duration, position and start date
    admin | Can create/delete players | Can set access, persistence, playlist and connection

    3.3 System design

    A pure Client/Server solution consisting of a monolithic, specialized server and simple clients communicating over a stateless RPC-protocol.

    3.3.1 Client

    There are three supported Playback tools. All are proprietary and can only handle MPEG1 streams.

  • Java JDK 1.02 Client User interface for Pentium 166+ with Windows95/NT consists of an applet that allows a server administrator to manage client streams on a specified server and a video player applet called SunMediaCenterPlayer that uses
    Java class libraries and a Windows-based DLL to communicate with audio and video rendering software.
    Java class libraries to communicate with a Sun MediaCenter server using the Media Stream Manager (MSM) RPC protocol.
    An ActiveMovie source filter for the Sun MediaCenter server.

  • Java JDK 1.02 Client User interface for Sun UltraSPARC with the Fast Frame Buffer and Solaris 2.4+ consists of an applet that allows a server administrator to manage client streams on a specified server and a video player applet called SunMediaCenterPlayer that uses
    A program called MpegExpert (MPX) that decodes and displays MPEG-1 format video content.
    Java class libraries and a Solaris shared library to run the MPX program so that it displays inside an applet.
    Java class libraries to communicate with a Sun MediaCenter server using the Media Stream Manager (MSM) RPC protocol.

  • It is also possible to view Streams using the Sun ShowMeTV Receiver.

    3.3.2 Server

    A completely hardware-dependent server software design with a modified Solaris kernel, modified network interface drivers dedicated to continuous media output, a Media File System (MFS) optimized for delivery of isochronous bit streams, a Media Stream Manager (MSM) that provides user access to the server and finally a Content Manager (CM) for loading/backup/restore of content over LAN or to/from DAT tape.

    3.3.3 Communication

    The MediaCenter server uses separate control and output network interfaces. The Control network must be TCP/IP/FastEthernet. Output is supported for UDP/IP/FastEthernet, UDP/IP/AAL5 ATM, AAL5 ATM, and Fast/wide SCSI.

    The Client application controls the server through an RPC-based API. A typical Control session looks like this:

    Table 2: A typical Control session
    Activity | RPC | Access required
    Establish Control session | msmServerOpen | None
    Get list of Titles | msmTitleList | Read Server
    Get attributes of a Title | msmTitleGetStatus | Read Server
    Create a Player | msmPlayerLookup | Admin Server, Read/Control/Admin Player
    Compute starting times of MPEG streams and dead air | No connection to server needed | No connection to server needed
    Build a playlist and instantiate | msmPlayerSetPlaylist | Admin Player
    Connect Player to destination | msmPlayerSetConnect | Admin Player
    Start Play | msmPlayerPlay | Control Player
    Check status of Play | msmPlayerGetPlayStatus | Read Player
    Delete Player | msmPlayerDelete | Admin Server
    Close Control session | msmServerClose | None
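
    The access requirements in Table 2 can be checked mechanically before a Control session is attempted. The sketch below only models the table; it does not call the MSM API, whose signatures are not described here. It assumes that the access levels are ordered (read below control below admin), which the source does not state, and it simplifies the Player access listed for msmPlayerLookup away.

```python
# Assumed ordering of access levels; None means "no access needed/held".
LEVELS = {None: 0, "read": 1, "control": 2, "admin": 3}

# (RPC name, required Server level, required Player level), per Table 2.
SESSION = [
    ("msmServerOpen", None, None),
    ("msmTitleList", "read", None),
    ("msmTitleGetStatus", "read", None),
    ("msmPlayerLookup", "admin", None),   # creating a Player (simplified)
    ("msmPlayerSetPlaylist", None, "admin"),
    ("msmPlayerSetConnect", None, "admin"),
    ("msmPlayerPlay", None, "control"),
    ("msmPlayerGetPlayStatus", None, "read"),
    ("msmPlayerDelete", "admin", None),
    ("msmServerClose", None, None),
]


def first_denied(server_level, player_level):
    """Return the first RPC of the typical session that would be refused,
    or None if the whole session is permitted."""
    for rpc, need_server, need_player in SESSION:
        if (LEVELS[server_level] < LEVELS[need_server]
                or LEVELS[player_level] < LEVELS[need_player]):
            return rpc
    return None


full_access = first_denied("admin", "admin")
viewer = first_denied("read", "control")
```

    Under these assumptions a user with only read access to the server can browse Titles but is stopped at the point where a Player has to be created.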

    3.3.4 Storage strategy

    The Media Stream Manager (MSM) handles operations at the Presentation level. This means that a Presentation is synonymous with the MSM Player entity. The Presentation is completely determined by the current state of its Player's playlist. The Playlist in turn consists of a sequence of Titles and dead air ordered in wall-clock time.

    The Content Manager handles the stored content according to the following hierarchy

  • Title. A named package of content consisting of
    At least one MPEG bitstream
    Optionally copies of the same MPEG bitstream(s), coded at different play speeds
    A Table Of Contents (TOC) file

  • TOC file. An ASCII file that contains:
    Title data consisting of name, version, MPEG format, short description
    A listing of all the MPEG bitstreams containing, for each of the MPEG streams:
      ◦ A short description
      ◦ The MPEG bitstream's play speed
      ◦ The MPEG bitstream's bitrate
      ◦ Paths to index and data files
      ◦ Sizes of index and data files

  • MPEG bitstream. One or more data files sequenced in time to form a Continuous Bit Rate bitstream.
    Index file. An ASCII file containing Splice Points, that is, mappings between Normal Play Time and data file offset. It is used for jumping between different versions of an MPEG bitstream coded at different play speeds, and for pausing and resuming Playback. There can be at most one index file per data file.
    Data file. A file containing MPEG-encoded audio and/or video data.
    The stored MPEG bitstreams are handled by a Media File System (MFS). The MFS can handle up to 2^31 files, each up to 2^64 - 1 bytes in size. The MFS stores the files in a RAID-like fashion over multiple disks, both to ensure isochronous delivery and for robustness.

    The TOC and index files are stored in the Unix File System (UFS) together with some additional internal data, and are therefore subject to Solaris limitations (see the section on Performance).
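
    The content hierarchy above can be sketched as a small data model. This is only an illustration under stated assumptions: the class and field names (Title, Bitstream, splice_points and so on) are hypothetical, the on-disk TOC and index file syntax is not reproduced, and Splice Points are assumed to be (NPT in microseconds, byte offset) pairs sorted by NPT. The lookup shows how an index could be used to find where to enter another version of the bitstream when the play speed changes.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Bitstream:
    """One MPEG bitstream of a Title (hypothetical model, not the TOC syntax)."""
    description: str
    play_speed: float   # 1.0 is assumed to mean normal play speed
    bitrate_bps: int
    # Splice Points: (NPT in microseconds, byte offset in data file), sorted by NPT.
    splice_points: list = field(default_factory=list)

    def splice_at_or_before(self, npt_us):
        """Return the last Splice Point with NPT <= npt_us, or None."""
        npts = [npt for npt, _ in self.splice_points]
        i = bisect_right(npts, npt_us)
        return self.splice_points[i - 1] if i else None


@dataclass
class Title:
    """A named package of content with one bitstream per play speed."""
    name: str
    version: str
    mpeg_format: str
    description: str
    bitstreams: list = field(default_factory=list)

    def stream_for_speed(self, play_speed):
        """Return the bitstream coded at the given play speed, or None."""
        for stream in self.bitstreams:
            if stream.play_speed == play_speed:
                return stream
        return None


normal = Bitstream("normal play", 1.0, 4_000_000,
                   [(0, 0), (10_000_000, 5_000_000), (20_000_000, 10_000_000)])
fast = Bitstream("fast forward", 4.0, 4_000_000,
                 [(0, 0), (10_000_000, 1_250_000), (20_000_000, 2_500_000)])
title = Title("example", "1", "MPEG-1", "made-up example title", [normal, fast])

# Switch to fast forward at NPT 12.5 s: enter the 4x stream at the nearest
# earlier Splice Point.
print(title.stream_for_speed(4.0).splice_at_or_before(12_500_000))
```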

    3.3.5 Hardware solution

    The system evaluated consists of the following hardware

    A Model 712MP server with

  • two SM71 75-MHz SPARC Modules, each with one SuperSPARC-II Processor and 1-MB SuperCache,
  • 96-MB of Memory,
  • 2*2.1-GB Internal Fast SCSI-2 Disk,
  • 644-MB Internal SunCD 2Plus,
  • 1.44-MB Internal Floppy Disk Drive,
  • 2*SunFastEthernet Adapter SBus Card,
  • 2*SBus Single-Ended Fast/Wide Intelligent SCSI-2 Host Adapter (SWIS/S),
    plus two 25.2-GB 5400-RPM MultiPack-6 RAID disks with 0.8m Cable, and the Sun MediaCenter and Solaris Server software.

    3.4 Performance

    3.4.1 Network related

    Network performance depends on the hardware configuration. The MediaCenter is guaranteed to deliver streams up to the limit of the output interface(s). In the Ethernet case, load balancing is performed over all available Ethernet interface cards, which means that a receiving host must be reachable from all output interfaces.

    3.4.2 Media related

    Output is fully CCIR601 compliant according to the MPEG specification. MPEG streams range from 1.5 Mbps to 8 Mbps per stream. Every MPEG bitstream must contain exactly one video stream.
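
    To get a feeling for what these bitrates mean for a single output interface, the following back-of-the-envelope sketch computes an upper bound on the number of concurrent streams. It assumes a 100 Mbps Fast Ethernet interface, takes 1 Mbps as 10^6 bits per second, and ignores all protocol overhead, so real figures will be lower. The function name max_streams is made up for this example.

```python
def max_streams(link_bps, stream_bps):
    """Upper bound on concurrent streams on one link; ignores protocol overhead."""
    return link_bps // stream_bps


FAST_ETHERNET_BPS = 100_000_000  # assumed nominal Fast Ethernet rate

print(max_streams(FAST_ETHERNET_BPS, 1_500_000))  # 66
print(max_streams(FAST_ETHERNET_BPS, 8_000_000))  # 12
```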

    3.4.3 System design related

    The Solaris 2.4 operating system limits file sizes to a maximum of 2.1 GBytes, a limit that may be encountered when loading content onto the MediaCenter server using ftp.
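
    How quickly this limit is reached depends on the bitrate. The sketch below estimates the longest constant-bitrate stream that fits in one file, assuming that "2.1 GBytes" means 2^31 - 1 bytes (a signed 32-bit file offset) and that 1 Mbps is 10^6 bits per second. Both are assumptions of this example, not figures from the Sun documentation.

```python
MAX_FILE_BYTES = 2**31 - 1  # assumed meaning of the 2.1 GByte limit


def max_minutes(bitrate_bps, max_bytes=MAX_FILE_BYTES):
    """Longest play time, in minutes, of a constant-bitrate file of max_bytes."""
    return max_bytes * 8 / bitrate_bps / 60


print(round(max_minutes(8_000_000), 1))  # about 35.8 minutes at 8 Mbps
print(round(max_minutes(1_500_000), 1))  # about 190.9 minutes at 1.5 Mbps
```

    Under these assumptions a feature-length film coded at 8 Mbps would not fit in a single file transferred this way, while one coded at 1.5 Mbps would.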

    MFS guarantees recovery from single-disk failures within 10 seconds and allows up to 2^31 files, each of which can be up to 2^64 - 1 bytes in size. It also makes efficient use of disk bandwidth (50-90%) by striping the content over the available disk packs.

    The system prevents files currently being streamed from being removed.

    The system guarantees isochronous delivery of MPEG streams to the network from storage, with defined limits on jitter and drift. I have not yet been able to find out what these limits are.

    The system also uses UFS for storage of TOC and index files, plus 1500 bytes/MFS file stored and 1bytes/10 minutes of play time in a title (for playlist and other status?).

    3.4.4 User friendliness

    An extremely primitive user interface and somewhat unstructured manuals, combined with many limitations that are spread all over the documentation, make setting up and administering this system an art.

    3.5 Standards

    The following formal or proprietary standards are used by the Sun MediaCenter server.

    3.5.1 Coding and Compression

    MPEG 1 (ISO/IEC 11172-1) or MPEG 2 (ISO/IEC 13818-1) audio and video.

    3.5.2 Call setup

    Direct Call setup using an ONC+(TM) RPC-based interface.

    3.5.3 Session Control

    The Client controls a Presentation through an ONC+(TM) RPC-based interface.

    3.5.4 Presentation description

    A Table of Contents (TOC) file specified using Abstract Syntax Notation One (ASN.1). The structure of the TOC file is described in the Administrator's Guide.

    3.5.5 Transport

  • Streams
    Isochronous MPEG 1&2 streams encapsulated in UDP/IP/FastEthernet, UDP/IP/AAL5 ATM, AAL5 ATM, and Fast/wide SCSI.

  • Presentation/Session description
    A list of available titles or the specific TOC file corresponding to a title can be obtained using an ONC+(TM) RPC-based interface over TCP/IP/FastEthernet.

  • Call setup and Session control
    Call setup and Session control through an ONC+(TM) RPC-based interface over TCP/IP/FastEthernet.

  • Multicast address allocation
    None available.

    3.5.6 User interface

    Java JDK 1.02 and native Solaris GUI.

    3.5.7 Interoperability support

    Can use IP transport to ensure interoperation over different networks and platforms. Uses the MPEG encapsulation on ATM standardized in ATM Forum document 95_0012R5.

    3.5.8 Security

    Supports the Solaris(R) implementation of Secure RPC.

    3.5.9 Time representation

  • Media time axis
    Called NPT, has the format microseconds in the range [0, 2^64 - 1] and is applied to a whole MPEG stream.

  • Presentation time axis
    Called "Playlist positions" and has the format seconds.nanoseconds.

  • "real-world" time axis
    Called "wall-clock time" and uses the Unix representation.
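
    The three time axes use different units, so a client has to convert between them. The following is a minimal sketch of such conversions, assuming that a Playlist position can be held as a (seconds, nanoseconds) pair and that wall-clock time is Unix time in seconds. The function names are hypothetical and do not come from the MediaCenter API.

```python
NPT_MAX_US = 2**64 - 1  # upper end of the NPT range given above


def npt_to_sec_nsec(npt_us):
    """Convert an NPT value in microseconds to a (seconds, nanoseconds) pair."""
    if not 0 <= npt_us <= NPT_MAX_US:
        raise ValueError("NPT out of range")
    seconds, micros = divmod(npt_us, 1_000_000)
    return seconds, micros * 1000


def format_position(seconds, nanoseconds):
    """Render a (seconds, nanoseconds) pair as seconds.nanoseconds text."""
    return f"{seconds}.{nanoseconds:09d}"


def wall_clock_at(start_unix_seconds, npt_us):
    """Unix time at which npt_us is reached, assuming normal-speed play."""
    return start_unix_seconds + npt_us / 1_000_000


print(format_position(*npt_to_sec_nsec(1_500_000)))
print(wall_clock_at(1000, 2_500_000))
```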

    4. Multimedia server survey

    A lot of information, so far quite unstructured and with many loose ends, can be found in Appendix B. Here is a listing of the Multimedia servers included so far:

  • SGI WebFORCE and MediaBase
  • XMovie project, University of Mannheim
  • Voyager project, Argonne National Laboratory, Illinois
  • MBone-VCR on demand Service, University of Mannheim
  • Oracle Video Client/Server
  • Sun MediaCenter
  • The RealMedia Model, Progressive Networks
  • StreamWorks, Xing Technologies
  • VXtreme client/server
  • The multicast Media-On-Demand/mMOD system, Luleå Tekniska Universitet
  • VivoActive from Vivo Software
  • IP/TV from Precept Software
  • NetShow 2.0 from Microsoft
  • VDOLive from VDONet Corporation
  • Vosaic
  • Video Conference Recorder (VCR) from UCL
  • MMCR

    5. References

    [1] "Real Time Streaming Protocol (RTSP)", H. Schulzrinne, A. Rao, R. Lanphier, Internet Engineering Task Force Request For Comments XXXX, 1997. Work in progress.
    [2] "RTP: A Transport Protocol for Real-Time Applications", H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, Internet Engineering Task Force Request For Comments 1889, Jan 1996.
    [3] "RTP Profile for Audio and Video Conferences with Minimal Control", H. Schulzrinne, Internet Engineering Task Force Request For Comments 1890, Jan 1996.
    [4] "A Video Conference Recorder - Design and Implementation", S Clayman, Dept. of Computer Science at University College London, Dec 1995.
    [5] "Hypertext Transfer Protocol -- HTTP/1.0", T. Berners-Lee, R. Fielding, UC Irvine, H. Frystyk, Internet Engineering Task Force Request For Comments 1945, May 1996.
    [6] "Hypertext Transfer Protocol -- HTTP/1.1", R. Fielding, UC Irvine, J. Gettys, J. Mogul, H. Frystyk, T. Berners-Lee, Internet Engineering Task Force Request For Comments 2068, Jan 1997.
    [7] "Information Technology - generic coding of moving pictures and associated audio information -part 6: Extension for digital storage media and control", Draft International Standard ISO 13818-6, International Organization for Standardization ISO/IEC JTC1/SC29/WG11, Nov 1995.
    [8] "SDP: Session Description Protocol", M. Handley, V. Jacobson, Internet Engineering Task Force Request For Comments XXXX 1997. Work in progress.
    [9] "mMOD: the Multicast Media-on-Demand system", Peter Parnes, Mattias Mattsson, Kåre Synnes, Dick Schefström, Submitted to the 7th International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV'97), at the Applied Research Laboratory, Department of Computer Science, Washington University in St. Louis, Missouri May 1997.
    [10] "The TLS protocol", A. Freier, P. Karlton, P. Kocher, Internet Engineering Task Force Request For Comments XXXX 1997. Work in progress.
    [11] "An extension to HTTP: digest access authentication", J. Franks, P. Hallam-Baker, J. Hostetler, P. A. Luotonen, Internet Engineering Task Force Request For Comments 2069, Jan 1997.
    [12] "SIP: Session Initiation Protocol", M. Handley, H. Schulzrinne, E. Schooler, Internet Engineering Task Force Request For Comments XXXX, 1997. Work in progress.
    [13] "SAP: Session Announcement Protocol", M. Handley, Internet Engineering Task Force Request For Comments XXXX, 1997. Work in progress.
    [14] "User datagram protocol", J. Postel, Internet Engineering Task Force STD 6 Request For Comments 768, Aug 1980.
    [15] "Transmission control protocol", J. Postel, Internet Engineering Task Force STD 7 Request For Comments 793, Sep 1981.
    [16] "Version 2 of the reliable data protocol (RDP)", R. Hinden and C. Partridge, Internet Engineering Task Force Request For Comments 1151, Apr. 1990.
    [17] "Internet Protocol", J. Postel, Internet Engineering Task Force Request For Comments 791, Sep 1981.
    [18] "File Transfer Protocol (FTP)", J. Postel, J. Reynolds, Internet Engineering Task Force Request For Comments 959, Oct 1985.
    [19] "Recommendation H.310 - Broadband audiovisual communication systems and terminals", International Telecommunication Union - Telecommunication standardization sector (ITU-T), Geneva, Switzerland. Nov 1996.
    [20] "Recommendation H.320 - Narrow-band visual telephone systems and terminal equipment", International Telecommunication Union - Telecommunication standardization sector (ITU-T), Geneva, Switzerland. Mar 1996.
    [21] "Recommendation H.321 - Adaptation of H.320 visual telephone terminals to B-ISDN environments", International Telecommunication Union - Telecommunication standardization sector (ITU-T), Geneva, Switzerland. Mar 1996.
    [22] "Recommendation H.322 - Visual telephone systems and terminal equipment for local area networks which provide a guaranteed quality of service", International Telecommunication Union - Telecommunication standardization sector (ITU-T), Geneva, Switzerland. Mar 1996.
    [23] "Recommendation H.323 - Visual telephone systems and equipment for local area networks which provide a non-guaranteed quality of service", International Telecommunication Union - Telecommunication standardization sector (ITU-T), Geneva, Switzerland. Nov 1996.
    [24] "Recommendation H.324 - Terminal for low bit rate Multimedia Communication", International Telecommunication Union - Telecommunication standardization sector (ITU-T), Geneva, Switzerland. Mar 1996.
    [25] "GSS-API authentication method for SOCKS version 5", P. McMahon, Internet Engineering Task Force Request For Comments 1961, Jun 1996.
    [26] "Rating Services and Rating Systems (and Their Machine Readable Descriptions)", J. Miller, P. Resnick, and D. Singer, REC-PICS-services-961031, Worldwide Web Consortium, Oct. 1996.
    [27] "Uniform Resource Locators (URL)", T. Berners-Lee, L. Masinter, M. McCahill, Internet Engineering Task Force Request For Comments 1738, Dec 1994.
    [28] "Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as Used in the World-Wide Web", T. Berners-Lee, Internet Engineering Task Force Request For Comments 1630, Jun 1994.
    [29] "URN Syntax", R. Moats, Internet Engineering Task Force Request For Comments 2141, Jan 1997.
    [30] "A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing", S. Floyd, V. Jacobson, C. Liu, S. McCanne and L. Zhang, IEEE/ACM Transactions on Networking, Nov 1995.
    [31] "Usage of H.323 on the Internet", P. Lantz, Internet Engineering Task Force Request For Comments XXXX, Feb 1997. Work in progress.
    [32] "Television, Audio and Film - Time and Control Code", Society of Motion Picture & Television Engineers Standard SMPTE 12M-1995, 1995.
    [33] "RTP Payload Format for MPEG1/MPEG2 Video", D. Hoffman, G. Fernando, V. Goyal, Internet Engineering Task Force Request For Comments 2038, Oct 1996.
    [34] "The HyTime Hypermedia/Time-based Document Structuring Language", S. Newcomb, N. Kipp, V. Newcomb, Communications of the ACM vol.34 No.11, Nov 1991.

    A Terminology

    The terminology used in this paper is mainly based on the terminology used in RTSP[1], but also includes terms defined in RTP[2 & 3], S. Clayman's paper on VCR[4], HTTP[5 & 6], DSM-CC[7], SDP[8], mMOD[9], SIP[12]

    A.1 Client

    The Participant initiating a Control session to a Multimedia server.

    A.2 Cache

    A program's local store of Media and the subsystem that controls its Media storage, retrieval, and editing. A Cache stores Cachable media in order to reduce the response time and network bandwidth consumption on future, equivalent requests. Any Client or server may include a Cache, though a Cache cannot be used by a server that is acting as a Tunnel [6].

    A.3 Cachable media

    A Media is Cachable if a Cache is allowed to store a copy of the Media for use in answering subsequent requests. Even if a Resource is cachable, there may be additional constraints on whether a Cache can use the cached copy for a particular request [6].

    A.4 Call

    A private Conference between exactly two Participants.

    A.5 Conference

    A multiparty, multimedia Presentation (multi is >= 1) [1 & 8]. A Conference is uniquely identified by a Conference ID [1].

    A.6 Connection

    A transport layer virtual circuit established between two programs for the purpose of communication [1 & 6].

    A.7 Content negotiation

    The mechanism for selecting the appropriate Representation when servicing a session control Message[6].

    A.8 Continuous media

    Data where there is a timing relationship between Source and Sink, that is, the Sink must reproduce the timing relationship that existed at the Source. The most common examples of Continuous media are audio and motion video. Continuous media can be realtime (interactive Media), where there is a "tight" timing relationship between Source and Sink, or streaming (playback Media), where the relationship is less strict [1].

    A.9 Control session

    The Client(s) control the Multimedia server system via some sort of messaging or signalling protocol. The lifetime of a Control session is independent of the lifetime of any Presentation or Conference that the Client and Multimedia server participate in. It is also independent of the lifetime of any transport connection used. The Control session between a certain Client and a certain Multimedia server system is uniquely identified by a Control session ID [1].

    A.10 Entity

    The information transferred as the payload of a Session control message. An Entity consists of metainformation in the form of Entity-header fields and content in the form of an Entity-body [1 & 6].

    A.11 Gateway

    A server which acts as an intermediary for some other server. Unlike a proxy, a gateway receives Messages as if it were the origin server for the Resource; the Client may not be aware that it is communicating with a Gateway [6].

    A.12 Initiator

    The Participant initiating a Conference Invitation. Note that the calling Participant does not have to be the same as the one creating a Conference [12].

    A.13 Invitation

    A request sent to attempt to contact a user (or service) to request that they participate in a Session [12].

    A.14 Invitee

    The person or service that the calling party is trying to invite to a conference. Also called Invited User [12].

    A.15 Lecture

    A Conference with one Participant acting as Source and at least one Participant acting as Sink.

    A.16 Location server

    A program that can be requested to provide one or more possible locations for a user or service without contacting that user or service directly [12].

    A.17 Media

    A digital encoding of some information, either digitized or computer-generated.

    A.18 Media parameter

    Parameter specific to a Media type that may be changed while, or prior to, the Stream is played [1].

    A.19 Media server

    The network entity providing playback or recording services for one or more media Streams. Different Media Streams within a Presentation may originate from different Media servers. Also called Multimedia server [1].

    A.20 Media type

    The Media Types are

  • Text
  • Graphics
  • Audio
  • Video

    A.21 Message

    The basic unit of Session control (RTSP and HTTP) communication, consisting of a structured sequence of octets matching the syntax defined in [1, 5 & 6] and transmitted via a connection or a connectionless protocol.

    A.22 Origin server

    The server on which a given resource resides or is to be created [6].

    A.23 Participant

    Participants are members of Conferences. A participant may be a machine, e.g., a Media record or playback server [1].

    A.24 PICS

    PICS is an infrastructure for associating labels (metadata) with Internet content. It was originally designed to help parents and teachers control what children access on the Internet, but it also facilitates other uses for labels, including code signing, privacy, and intellectual property rights management. PICS is a platform on which other rating services and filtering software has been built [26].

    A.25 Presentation

    A set of one or more Streams which the Media server allows the Client to control together. A presentation has a single time axis for all Streams belonging to it. Presentations are defined by Presentation descriptions. A movie or live concert consisting of one or more audio and video streams is an example of a Presentation. Also called Session for a live Presentation(see Session) [1].

    A.26 Presentation description

    A Presentation description is a type of Session description for describing Presentations. It conveys sufficient information to discover and participate in a Presentation as well as optional information about the content of the Presentation [1].

    A.27 Proxy

    An intermediary program which acts as both a server and a client for the purpose of sending session control Messages on behalf of other clients. Messages are serviced internally or by passing them on, with possible translation, to other servers [6].

    A.28 Representation

    An Entity included with a Message that is subject to content negotiation. There may exist multiple Representations associated with a particular Message status [6].

    A.29 Resource

    A network data object or service that can be identified by a URI [6]. For example, media Streams, Presentations and Presentation descriptions are Resources.

    A.30 Session

    One or more Participants participating in a Conference, Lecture or Call. A Session including Streams of more than one Media type is also called a Multimedia session[8].

    A.31 Session announcement

    A mechanism by which a Session description is conveyed to users in a pro-active fashion, i.e., the Session description was not explicitly requested by the user. Also called Session advertisement [8].

    A.32 Session description

    A Session description contains information about one or more media Streams within a Session, such as the set of encodings, network addresses and information about the content. It conveys sufficient information to discover and participate in a multimedia Session [8].

    A.33 Sink

    A sink for a specific Stream. There can be several Participants receiving the same Stream on the same host.

    A.34 Source

    The host from which a specific Stream originates. There can be several Participants sending from the same Source.

    A.35 Stream

    A "stream" of packets containing Media of one Media type. A Stream can be one-to-one, one-to-many, many-to-one or many-to-many. It can be delivered in real-time or streamed, and delivered by an unreliable or reliable transport. Also called Media stream [7].

    A.36 Title

    A package consisting of an orchestrated set of Media content, identified by a single title and intended for delivery as an atomic unit. Typically protectable by copyright as a work of art.

    A.37 Tunnel

    An intermediary program which is acting as a blind relay between two connections. Once active, a tunnel is not considered a party to the communication. The tunnel ceases to exist when both ends of the relayed connections are closed [6].

    A.38 URC

    Uniform Resource Citation, or Uniform Resource Characteristics. A set of attribute/value pairs describing a Resource. Some of the values may be URIs of various kinds. Others may include, for example, authorship, publisher, datatype, date, copyright status and shoe size. Not normally discussed as a short string, but a set of fields and values with some defined free formatting. W3C has developed the PICS technology for resource description [26].

    A.39 URI

    Uniform Resource Identifier. URIs are the union of URLs (Uniform Resource Locators), URNs (Uniform Resource Names) and URCs (Uniform Resource Characteristics). These are all supposed to work together in a general scheme for naming, describing, and retrieving Resources on the Internet [28].

    A.40 URL

    Uniform Resource Locator. A compact string representation for a Resource available via the Internet. URLs are used to locate Resources by providing an abstract identification of the Resource location. Having located a Resource, a system may perform a variety of operations on it, for example access, update, replace, or find attributes [27].

    A.41 URN

    Uniform Resource Name. A single, persistent, location-independent, name of a Resource. Needs a name resolution mechanism to map the name to the Resource's URL(s) and URC(s) [29].

    A.42 User agent

    A program which contacts an Invitee to inform them of an Invitation, and to return a reply [12].

    A.43 Variant

    A Resource may have one, or more than one, Representation(s) associated with it at any given instant. Each of these representations is termed a Variant [6].

    B Survey

    B.1 SGI WebFORCE and MediaBase

    Integrated Web-media servers that deliver interactive, real-time, high-quality MPEG1 and MPEG2 video and audio streams to Web clients via IP and ATM networks.

  • Content Management
  • Scalable Media Delivery Services (the video pump)
  • Web Integration
  • High-Performance Networking
  • LAN Multicasting
  • Storage Management
  • Operations and Management Tools
  • Cosmo(TM) StreamPlayer Video Client
    28.8Kbps to 8Mbps bandwidth for Internet and intranet applications. Cosmo MediaBase 2.0 will run on the Origin(TM) systems. http://www.sgi.com/Products/WebFORCE/Products/Mediabase/

    B.2 XMovie project, University of Mannheim

    See http://www.informatik.uni-mannheim.de/~keller/XMovie/XMovie.html

    A CMClient (Continuous Media client) in the Xserver machine uses MTP (Movie Transmission Protocol) on TCP and UDP to communicate with a Video server that is located somewhere else. The X11 protocol is extended to handle continuous media streams. The CMClient uses a shared memory buffer and the extended X11 to communicate with the Xserver. The CMClient uses the AF (Audio File) protocol and the extended X11 to communicate with the AudioFile server, which also resides in the Xserver machine. The playback application uses the MCAM (Movie Control, Access and Management) protocol to communicate with the CMClient.

    The Xserver doesn't have to reside in the same machine as the playback application, and X11 needs less bandwidth for CM. This is a good solution for viewing video streams on dumb Xterminals (no disk, very little memory).

    Of course, neither the local network nor the Xserver machine can cope with more than 2 - 3 CMClients at a time, plus a few more player applications watching the same stream, because all decoding has to be done in the CMClient and the Xserver has to mediate the full video frames to the playback application. And what about the AudioFile server?

    But can this extended X11 support sharing of X applications?

    B.3 Voyager project, Argonne National Laboratory, Illinois

    The Voyager system has a distributed client-server architecture, where the server side consists of a set of cgi-scripts for session browsing, recording, and playback; a pair of low-level daemons for record and playback; a Perl-based architecture for gluing the layers together (using nPerl for interdaemon communication (1)); and a relational database to act as a central repository of information for the system as a whole. The design goal is to serve large numbers of high-bandwidth streams (nominal 5Mbps or more) both in record and playback mode.

    We use the IBM TigerShark multimedia filesystem on the server machine, in order to do striping across multiple disk server nodes in the SP2 we're using as a server.

    On the client side we just use the mbone tools -- vic/vat or the Precept tools.

    We demonstrated an earlier version of Voyager at Supercomputing '95 -- see http://www.iway.org/video/index.html for a little info; information on the current version is available at http://voyager.mcs.anl.gov/Voyager/.

    (1) Nexus/Perl -- see http://www.mcs.anl.gov/nexus/nperl/

    B.4 MBone-VCR on demand Service, University of Mannheim

    A while ago I implemented one of the recording and playback applications on the MBone, the MBone-VCR, but unfortunately it is a little outdated since I didn't have enough time to look after it after I left ICSI (where I developed it). Anyhow, some ideas were still in my mind, and together with two students in Mannheim we could start a new project a couple of weeks ago that we called "MBone-VCR on demand Service".

    Well, we just started, so there are lots of things we have to investigate further, but since John brought the issue up, I thought this might be a good point to throw our ideas in the pot as well...

    A brief overview of our project goals so far:

  • develop a distributed client-server architecture where MBone-VCR clients request recordings and playbacks from an MBone-VCR server.
  • clients will be java-applets so they can be viewed and run with any java-capable browser on any platform. The clients take care to launch the appropriate applications to view the recordings. The information about which application is appropriate comes from the server. The interaction between clients and servers will be via remote objects.
  • the server will be partially implemented as java-applications (not applets!) and partially using C++. The interface to the client will be implemented in java (so we can use the same RO-stack); time-critical issues like retrieving RTP packets from the net and storing them in a vcr-file will be implemented in C++ and bound to the java application as a shared library. The server will implement an interface to sdap in order to be able to receive session announcements (which the server will offer the clients as possible candidates that they can record) and to announce playbacks requested by clients.
  • define a protocol between clients and servers that will provide:
    session control and management, e.g.:
      ◦ listing of available session announcements
      ◦ listing of available recordings
      ◦ scheduling of recordings
      ◦ scheduling of playbacks
      ◦ change parameters of a session (e.g. lower the ttl)
    stream control and management, e.g.:
      ◦ play, stop, pause, record, ff, rew, set index, jump to next index, ...
      ◦ events like "end of recording", "sync" or "update timing"
    floor control:
      ◦ allow only one client to control the session while multiple clients may receive it
      ◦ allow passing of the floor control token

  • The file format should be:
      ◦ media independent
      ◦ self describing
      ◦ interchangeable
      ◦ efficient
    and optionally support
      ◦ fast random access
      ◦ indexing
      ◦ editing

    There are a lot more things to mention, but I think you get the idea and we have enough to talk about and discuss. We are at the very beginning of the project and would be happy to have a lively discussion and new ideas that we may include in the architecture.

    Contact:

    Wieland Holfelder
    University of Mannheim, Praktische Informatik IV
    L 15,16
    68131 Mannheim
    Germany
    Email: whd@pi4.informatik.uni-mannheim.de
    Fax: +49-621-292-5745
    Phone: +49-621-292-3300
    http://www.informatik.uni-mannheim.de/~whd
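
    The floor control goal listed in the project overview above (one controlling client, many receivers, a token that can be passed on) can be illustrated with a small sketch. This is not the MBone-VCR protocol, which was not yet defined when the note was written. The class FloorControl, its methods, and the rule that the first client to join gets the floor are all assumptions made for this example.

```python
class FloorControl:
    """Toy model: many clients may receive, only the floor holder may control."""

    def __init__(self):
        self.receivers = set()
        self.holder = None  # client currently holding the floor token

    def join(self, client):
        """Add a receiving client; the first one to join gets the floor (assumption)."""
        self.receivers.add(client)
        if self.holder is None:
            self.holder = client

    def may_control(self, client):
        """Only the floor holder may issue play, stop, pause and so on."""
        return client == self.holder

    def pass_floor(self, from_client, to_client):
        """Hand the floor token to another client that has joined."""
        if from_client != self.holder:
            raise PermissionError("only the floor holder can pass the token")
        if to_client not in self.receivers:
            raise ValueError("target client has not joined the session")
        self.holder = to_client


floor = FloorControl()
floor.join("alice")
floor.join("bob")
print(floor.may_control("alice"), floor.may_control("bob"))
floor.pass_floor("alice", "bob")
print(floor.may_control("alice"), floor.may_control("bob"))
```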

    B.5 Oracle Video Client/Server

    This is a short summary from the Oracle Video Server manual. We haven't been able to verify the information in the manual yet because of a lack of funds and personnel to do the installation and run the tests.

    B.5.1 Oracle Video Client

    Platform: PC/Windows 3.11
    API: 16-bit OLE API, e.g. Visual Basic 4.0.
    Format: MPEG-1
    Comment: Oracle Media Net uses UDP/IP, so let's implement a new client.

    B.5.2 Oracle Video Server

    Platform: Sun, HP/Unix
    API: System Services API, C
    Formats: MPEG-1 implemented. Supports storing of Quicktime, MPEG-1 & 2, TrueMotion, DVI, JPEG, etc. Can be extended through the System Services API, C
    Features: Multi-user. Striping & RAID disk access
    Comments: I like the disk handling. The API is bearable. Performance is not affected by disk fragmentation.

    B.5.3 Oracle Media Net

    Overview:

    API: Media Net RPC, Media Net XDR, Semaphores, Media Net Transport Layer API, System Services API, C
    Protocol: UDP/IP. Possible to extend to, for example, RTP/UDP/IP.
    Features:
      Oracle Media Net Logger Process
      Oracle Media Net Name Server (servers and clients have their own unique name space built on top of the underlying transport layer (UDP))
      Oracle Media Net Address Server (maps Media Net addresses to corresponding physical addresses (UDP == physical address?))
      Oracle Media Net Process Server (typical portmap process)
      Connection Service (???)

    B.5.4 Interactive Applications Objects SDK Object Reference

    A typical convenience functions library in C. Cannot see any objects though.

    B.5.5 Other comments and questions

    1) Total Bandwidth demand?

    bw_demand = #streams * average_bitrate * 1.2
    where the "1.2" stands for adm. loss and other unforeseen loss
    2) FDDI network interface

    30 streams @ 2 Mbps
    40 streams @ 1.5 Mbps
    3) Max output from Sparc center 1000

    4 FDDI cards:
    120 streams @ 2 Mbps => 240 Mbps
    bw_demand = 240 * 1.2 = 288 Mbps
    Needs approx. 256 MB RAM?
    4) Media clips and all that

    Is supported.
    5) Videopump

    Max # video streams = 20 (MPEG version)
    This is the maximum number of video streams from Oracle Video Server, which I think is an implementation of Oracle Media Server. We can probably write our own media pump programs for the Media Server that can encode and transmit formats other than MPEG over UDP.
    6) RTP MPEG-1 & 2 payloads defined

    7) RTP JPEG payloads defined

    JPEG also supported in the System Services API of the Media Server according to the documentation.
    8) RTP over UDP/IP

    Because RTP has
    Sequence numbering: Ignore out-of-date packets and copies
    RTP routers terminate network loops
    Conference control and management (Also supported by the OMN unfortunately)
    Time stamps: Connection quality and delay
    Data rate- and routing negotiation possible via RTCP
    Implementations exist
    Can use System Services API to encapsulate any format in RTP, but how do we handle RTCP messaging?
    9) Do students have to run their own Mrouted at home to be able to receive?

    10) How many ISDN BRI or Modem 28.8 connections on 240 Mbps?

    11) Media Net RPC

    Is it implemented or not?
    run-time lib
    interface def. language
    error handling
    RPC compiler (Not yet?)
    12) Oracle Media Server usability

    Oracle Video Server and Oracle Media Net.
    Can be used on a LAN for a few streams, but we have to modify it for other formats and bitrates.
    For Internet use and ISDN/Modem users we need a lot of modifications.
    The current Server user and programmer interfaces are horrible.
    Disk- and request handling in the server is good.
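
    The arithmetic in items 1, 3 and 10 above can be checked with a short sketch. The function names are hypothetical, and the ISDN BRI rate is taken as 2 x 64 kbps = 128 kbps. The connection count for item 10 is only a rough upper bound: it divides the 240 Mbps payload by the line rate and ignores the 1.2 loss factor and all protocol overhead.

```python
def bandwidth_demand_mbps(streams, average_bitrate_mbps, loss_factor=1.2):
    """Total bandwidth demand: #streams * average_bitrate * loss factor."""
    return streams * average_bitrate_mbps * loss_factor


def max_connections(link_mbps, connection_kbps):
    """How many fixed-rate connections fit on a link, ignoring all overhead."""
    return int(link_mbps * 1000 // connection_kbps)


# Item 3: four FDDI cards at 30 streams each, 2 Mbps per stream.
streams = 4 * 30
payload_mbps = streams * 2
demand_mbps = bandwidth_demand_mbps(streams, 2)

# Item 10: connections that fit in the 240 Mbps payload (assumed rates).
isdn_bri_connections = max_connections(payload_mbps, 128)
modem_connections = max_connections(payload_mbps, 28.8)

print(payload_mbps, demand_mbps, isdn_bri_connections, modem_connections)
```

    Under these assumptions the payload is 240 Mbps, the demand is 288 Mbps, and the link would carry at most 1875 ISDN BRI or 8333 modem connections.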

    B.6 Sun MediaCenter

    KTH/Teleinformatics has received a Sun MediaCenter 20 for evaluation. This is another video server platform, equipped with three 100 Mbps Ethernet interfaces and able to stream MPEG-1 & 2 over UDP/IP. The client runs on Sun/Solaris and PC/Windows95.

    http://www.sun.com/products-n-solutions/hw/servers/smc_external.html

    B.7 The RealMedia Model, Progressive Networks

    We have six elements in our current model: Servers, Splitters, License Servers, System Managers, Players, and Encoders. These elements form a tree hierarchy for any given stream. Each stream has an identifier (URL) which identifies the server originating it and the object in that server's namespace (the RealAudio file). Both live and on-demand streams (the former originating from live encoders and the latter originating from files) appear in a single hierarchical namespace maintained by the server.

    The protocol used, RTSP, is currently under standardisation in the IETF MMUSIC WG.

    See also http://www.real.com/products/server/index.html for a list of Streaming Media Servers.

    Servers accept connections from Players and other Splitters. Each such connection takes the form of a TCP connection. Player connections support the semantics needed for random access for on-demand (stop, start, seek, pause and resume) and simple stop/start for live streams. The TCP connection remains in place until the player terminates the connection, at which point a single packet of quality of service information is forwarded to the server over the TCP link before the player disconnects.

    The delivery of data from the Server to the Player can be in three modes:

  • UDP unicast. To cope with packet loss players also support UDP-resend requests for UDP unicast connections
  • TCP unicast and
  • UDP multicast.
    The type of connection is a function of the player, which chooses TCP or UDP and the server, which chooses multicast or unicast for UDP connections.

  • In the case of Multicast, the Server communicates the UDP port and Class D address to listen on to the Player at startup.
  • In the case of Unicast, the Server communicates the UDP or TCP port (which is always the control port) on the Server's unicast IP address.
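
    The division of choices described above (the Player picks TCP or UDP, the Server picks multicast or unicast for UDP) can be illustrated with a minimal sketch. The function and parameter names are hypothetical and are not part of any RealMedia API.

```python
def delivery_mode(player_transport, server_multicast):
    """Combine the Player's and the Server's choices into a delivery mode."""
    if player_transport not in ("tcp", "udp"):
        raise ValueError("player_transport must be 'tcp' or 'udp'")
    if player_transport == "tcp":
        # The Server's multicast/unicast choice only applies to UDP.
        return "TCP unicast"
    return "UDP multicast" if server_multicast else "UDP unicast"


print(delivery_mode("udp", server_multicast=True))
```
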
    In the case of Splitters the connection is always TCP and the control channel supports a protocol for license management, whereby the Splitter can accept connections and cause the stream license of the originating server to be decremented. This latter protocol is asynchronous using a debit/credit system to prevent synchronous license requests from killing the server. There is only one Server/Splitter connection per stream being split by a given Splitter. The TCP delivery from Server to Splitter assures that the stream is pure, at the cost of buffering in the Splitter (the depth of the buffer being a configuration parameter of the Splitter).

    Since Servers can be configured as multiple processes or Clusters of different hosts, the Player-Server connection also supports "redirection", whereby the Player is silently redirected to re-connect to another host during connection to achieve load balancing. In the future we plan to use redirection based on other heuristics such as redirecting players to the topologically closest Splitter or Cluster member. Splitters can also be clustered.

    Another aspect of our Server-Player protocol is "bandwidth negotiation". The Player communicates its capabilities in terms of bandwidth and codecs to the Server, which selects a matching "profile" for delivery to the Player. Currently profiles comprise single audio streams, but in the future we plan to support sessions with multiple streams, grouped into profiles. The protocol also supports the selection of streams by the Player on a dynamic basis. For example, the user could choose to switch between cameras in a multiple video session. In our more complete model the player should be able to respond dynamically to a new stream arriving mid-session and route the data to the appropriate Player module for play-back.
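
    A minimal sketch of such a profile selection follows. The text does not say how the Server picks among matching profiles, so the policy used here (the highest bitrate the Player can both decode and receive) is an assumption, and the Profile type, the function name and the codec names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    codec: str
    bitrate_kbps: float


def select_profile(profiles, player_bandwidth_kbps, player_codecs):
    """Return the highest-bitrate profile the Player can handle, or None."""
    usable = [
        p for p in profiles
        if p.codec in player_codecs and p.bitrate_kbps <= player_bandwidth_kbps
    ]
    # Assumed policy: prefer the best quality that still fits the link.
    return max(usable, key=lambda p: p.bitrate_kbps, default=None)


profiles = [
    Profile("low", "codec_a", 14.4),
    Profile("mid", "codec_a", 28.8),
    Profile("high", "codec_b", 56.0),
]
chosen = select_profile(profiles, 28.8, {"codec_a"})
print(chosen.name if chosen else "no matching profile")
```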

    The Splitter is actually a function of the Server, but a Server can be configured for Splitting alone. Splitters look like Servers to players, the only difference being that they are addressed as proxies, either within the Player by static configuration (the default Splitter) or explicitly in the URL. Splitter URLs take the form pnm://splitter/pnm://server/filename... The Player strips off the pnm://splitter/ piece and passes the remaining URL to that Splitter over a standard Player-Server connection. Splitters can be nested in URLs more than one deep, in which case each level removes the splitter prefix and forwards the remaining URL to that Splitter. Splitters too can be configured with default proxies to use as their Splitters.
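
    The prefix stripping described above can be sketched as follows. This is an illustration only: the function names, the host names and the file name are hypothetical, and real Players and Splitters may parse URLs differently.

```python
SCHEME = "pnm://"


def strip_splitter(url):
    """Split 'pnm://splitter/pnm://server/file' into (splitter, remaining URL).

    Returns (None, url) when the URL names an origin Server directly.
    """
    if not url.startswith(SCHEME):
        raise ValueError("not a pnm:// URL: %r" % url)
    host, _, rest = url[len(SCHEME):].partition("/")
    if rest.startswith(SCHEME):
        return host, rest  # host is a Splitter; rest is forwarded to it
    return None, url


def splitter_chain(url):
    """List the Splitters in a nested URL, outermost first, plus the final URL."""
    chain = []
    splitter, rest = strip_splitter(url)
    while splitter is not None:
        chain.append(splitter)
        splitter, rest = strip_splitter(rest)
    return chain, rest


nested = "pnm://splitter1.example/pnm://splitter2.example/pnm://server.example/clip.ra"
print(splitter_chain(nested))
```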

    Like Servers, Splitters can deliver data to Players via TCP or UDP (unicast and multicast), although they always receive their inputs via TCP. There are many elements of splitting which need further work, specifically: caching on-demand content, roll-up statistics to Servers and the forwarding of user input to the Servers (e.g. interactive user choices); protocols for finding Splitters by Players based on network topology and the insertion of content as data travels through the Splitter hierarchy; and setup of Multicast over the Internet. Another area for future work is Player Authentication. Today we support what we call validation, whereby the Server or Splitter and Player ensure that they are bona-fide RealAudio components. Authentication is the function of allowing access based on Player identity, tickets or other criteria.

    Typically Servers will be configured to reserve some or all of their connections for Splitters and can explicitly set up the IP addresses of those Splitters. Splitters can be configured to explicitly allow Splitting of certain Servers and streams. Player connections too can be confined to certain IP addresses or subnets.

    License Serving is another function of the Server. Servers can get their licenses (for a number of streams) based on a static key in their local configuration file, or from another "License Server" at startup. In fact the two capabilities can work together, with a fixed local license augmented from a License Server. Splitting functions always depend upon the dynamic license protocol described above and are in addition to any "Server License" which controls the origination of content at the server itself.

    System Managers are entities which connect to Servers and Splitters and allow the configuration to be dynamically viewed and modified. They use a different protocol, which is a dialect of RBP (as are all our protocols, although the document you have only describes the Server/Player piece). Our current direction is for the Server to also support SNMP. The connection of a System Manager to a Server is controlled in two ways: a password and the ability of a Server to specify explicit IP addresses from which it will accept System Manager connections.

    Players are pretty self-evident in their function, but it is important to understand that we see the TCP connection between the Player and the originating point (Server or Splitter) of its session as very important to allow the originators of content to gain explicit information about who is listening. Looking down from the origination point, the System Management functions should allow a Server operator to see exactly who is connected to all sessions originating from themselves. At any point in the Splitter/Server hierarchy the owner of a Splitter should be able to see all traffic it is passing and who is connected downstream. It is also explicit in our model that connections are initiated upstream, although this could change.

    Finally, Encoders. Encoders take raw input, compress it and pass it over a TCP connection to a Server. The encoder supplies a password to the Server (and may have to be associated with explicit IP addresses set up on the Server) and also provides a virtual filename and codec type for the stream. Within the Server, multiple streams with the same filename but different sources are combined into a session with that filename, as seen by Players or Splitters. Today the model does not fully support profiles within Live Sessions but it should.

    B.7.1 Progressive Networks RealPlayer

    Stereo audio at 28.8, near-CD quality at higher bitrates, AM-quality audio at 14.4. Newscast-quality video at 28.8 and full-motion at higher bitrates.

    The Player is currently available for Windows95/NT and Macintosh PowerPC. In the future it will also be available for:

    Linux 1.2.x with an OSS (Voxware) 3.0 sound driver

    Linux 2.0.x with an OSS 3.5.x sound driver on a Pentium machine

    Solaris 2.5 and Solaris 2.4

    SunOS 4.1.x

    Irix 5.3

    FreeBSD 2.0

    See http://www.real.com/sysreq.html for detailed Player system requirements.

    B.7.2 RealVideo Encoder 1.0 beta 1

    is currently available for Windows 95/NT. RealVideo Encoder plug-in for Adobe Premiere is available for Macintosh Power PC, and comes with the Windows encoder.

    See http://www.real.com/products/encoder/realvideo/sysreq.html for detailed RealVideo Encoder system requirements.

    B.7.3 RealAudio Encoder 3.0

    is currently available for Windows 95/NT, Macintosh PowerPC, Linux, HP/UX, AIX, FreeBSD, Irix and Digital Unix platforms. RealAudio Encoder 3.0 beta is currently available for Solaris 2.5. RealAudio Encoder 2.0 is still available for users with a Macintosh 68040 with FPU, and for users with SunOS 4.1. The RealAudio Xtras for SoundEdit 16 are currently available only with RealAudio Encoder 2.0.

    See http://www.real.com/products/encoder/realaudio/sysreq.html for detailed RealAudio Encoder system requirements.

    B.8 StreamWorks, Xing Technologies

    You first need a StreamWorks Encoder/Transmitter. It is a PC box we ship to your address by DHL. It comes either in a PC box form or a rack-mount configuration. It enables you to make your source streams from standard AV inputs (RCA, etc.). A full audio/video system costs $6,500. It is fully PAL compatible. The Audio and Video Transmitter is the one you would probably utilize for your audio and video broadcasts.

    In order to send streams across the Internet or LANs or WANs one needs our StreamWorks Server, which is software that you download from our site here that sits on your web server machine, UNIX box, or whatever you are currently using. Our products handle everything from 8.5 kbps (AM-quality mono audio) all the way up to 112 kbps (CD-quality audio, or 30 frames per second of MPEG-1 video (VHS quality) with AM-type audio on top of that).

    B.9 VXtreme client/server

    A PC application based on a proprietary codec running over modem speeds (14.4 kbps, 28.8 kbps).

    B.10 The multicast Media-On-Demand/mMOD system, Luleå Tekniska Universitet

    http://ctrl.cdt.luth.se/~peppar/progs/mMOD/

    MBone audio, video, whiteboard, NetText, mWeb or any other UDP multicast or unicast session can be recorded and played back in two modes: raw UDP or parsed RTP. In parsed RTP mode the recorder checks for duplicate and out-of-order packets and rearranges packets based on their sequence numbers. All packets get a timestamp when recorded.
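
    A minimal sketch of what parsed RTP recording involves is given below. It is written in Python for brevity and is not the mMOD code (which is written in Java); the function names are hypothetical. It reads the 16-bit sequence number from the 12-byte fixed RTP header, keeps the first copy of each packet and orders the result by sequence number. Wrap-around of the sequence counter is deliberately not handled.

```python
import struct


def rtp_sequence_number(packet):
    """Read the 16-bit sequence number from the 12-byte fixed RTP header."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    # Fixed header: V/P/X/CC byte, M/PT byte, sequence, timestamp, SSRC.
    _flags, _marker_pt, sequence, _timestamp, _ssrc = struct.unpack(
        "!BBHII", packet[:12]
    )
    return sequence


def record_parsed(arrivals):
    """Drop duplicates and order packets by sequence number.

    arrivals is a list of (arrival_time, packet_bytes) pairs. Returns a list
    of (arrival_time, sequence, packet_bytes) tuples.
    """
    seen = {}
    for arrival_time, packet in arrivals:
        sequence = rtp_sequence_number(packet)
        if sequence not in seen:  # keep only the first copy
            seen[sequence] = (arrival_time, sequence, packet)
    return [seen[sequence] for sequence in sorted(seen)]
```

    The arrival time stored with each packet corresponds to the timestamp that every packet gets when recorded.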

    The mMOD system consists of distributed stand-alone VCR-programs and distributed Web controller-interfaces.

    Messages are sent on a special multicast/port-pair or TCP/port.

    Multicasted messages:

    alive_message = All active VCR-programs respond with their unique ID, session name, owner, playing/recording state, media and control-port number.
    Multicasted or direct messages:

    stop, pause, continue, skip, rewind, jump
    No group control of a multicasted session possible. Only the owner controls a session.

    The mMOD is completely written in Java v1.0.2 and is available for SUN/Solaris and Windows95/NT4.0.

    B.10.1 The VCR-program

    is a stand-alone program and also includes a minimal HTTP-server listening on the control-port. This enables 1) a Java-client that is directly connected to the VCR and 2) use of the mWeb slide-show presentation feature.

    B.10.2 The Web-controller

    uses a MIME-type application/x-sdp and SDP-files for launching appropriate helper applications on the client-side. The Web-controller has CGI-based and Java user interfaces.

    B.11 VivoActive from Vivo Software

    Windows 95 or NT, PowerMac. H.263 video compression and G.723 audio compression. Once the original .AVI file is converted to a .VIV file, the Web site developer simply uses the HTML <EMBED> command to embed the VIVO file into the Web page, the same way he or she would embed a .GIF or .JPG file. Free VivoActive Players are available for a variety of browsers and platforms (plug-ins for Netscape Navigator on the Macintosh, Windows 95, Windows NT, Windows 3.1, and an ActiveX control for Microsoft Internet Explorer on Windows 95 or NT). The video is streamed from the Web server via HTTP over TCP, just like the rest of the page.

    B.12 IP/TV from Precept Software

    http://hydra.precept.com/products/ip-tv.htm

    IP/TV uses RTP over standard IP Multicast to deliver full-screen, full-motion video to desktop PCs over private IP routed networks and the Internet. IP/TV has three elements:

  • the Program Guide, which schedules programs, controls user access and manages bandwidth usage. Runs on any NT or Unix WWW-server.
  • the Video Server, which delivers live or prerecorded video from such devices as cameras and VCRs according to schedules in the Program Guide
  • the Viewer, which presents the user with a list of scheduled programs.

    B.13 NetShow 2.0 from Microsoft

    Microsoft NetShow 2.0 provides an easy, powerful way to stream multimedia across the Internet and intranets. It is a complete, high-performance system for broadcasting live and on-demand audio, video and mixed multimedia. NetShow technology is built on open industry standards. NetShow supports the ASF file format, which is capable of using media in MOV, AVI, WAV, BMP, DIB, JPEG and other standard formats. NetShow also supports the H.263 video standard, as well as transport and protocol standards such as UDP/IP, TCP/IP, HTTP, RTP and IP multicast. NetShow 2.0 beta software is available now without charge from the Microsoft Web site at http://www.microsoft.com/netshow/

    B.14 Video Conference Recorder (VCR) from UCL

    Stuart Clayman, A Video Conference Recorder - Design and implementation, Department of Computer Science, University College London, December 1995.

    Records and plays back UDP, TCP and IP packets. Telnet-based user interface.

    B.15 VDOLive from VDONet Corporation

    No data yet

    B.16 Vosaic

    No data yet

    B.17 MMCR

    No data yet


    Maintained by Tobias Öbrink