Puffer

Along with our research paper, we are publishing anonymized data collected on Puffer for the research community to investigate. As our experiments are ongoing, new data is collected each day. This data is posted daily to the Experiment Results page, which also contains all data collected since the experiment began in January 2019. This page gives a brief description of the data. Please see the README in the puffer-statistics repo for more details on the data analysis.

Raw Data

A single day of data is several GB, so please download a small set of fake data first to determine whether the full Puffer data is what you need. We would also be grateful if you could download the data from our server only once. If anything in the data description below is unclear, please don't hesitate to post a question in our Google Group.

At a high level, each day's Puffer data comprises several "measurements". Each measurement contains a different set of time-series data collected on Puffer servers and is dumped as a CSV file. The CSV files that are essential for analysis are video_sent_X.csv, video_acked_X.csv, and client_buffer_X.csv, where X identifies the day on which the data was collected. For example, X = "2019-11-04T11_2019-11-05T11" means the data was collected between 2019-11-04T11:00:00Z and 2019-11-05T11:00:00Z (UTC is the default time zone). In addition to these three CSVs, we also release video_size_X.csv and ssim_X.csv, which are described below.
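
As an illustration of the naming scheme, the sketch below splits an X label into its two UTC boundaries and checks whether a nanosecond timestamp falls between them. The function names are hypothetical, and treating the window as half-open (start included, end excluded) is an assumption, not something this page specifies.

```python
from datetime import datetime, timezone

LABEL_FORMAT = "%Y-%m-%dT%H"  # e.g. "2019-11-04T11"
NS_PER_SECOND = 1_000_000_000


def parse_window(label):
    """Split a label such as '2019-11-04T11_2019-11-05T11' into two UTC datetimes."""
    start_text, end_text = label.split("_")
    start = datetime.strptime(start_text, LABEL_FORMAT).replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_text, LABEL_FORMAT).replace(tzinfo=timezone.utc)
    return start, end


def in_window(time_ns, label):
    """Check whether a nanosecond Unix timestamp lies in the labelled window.

    Assumption: the window is half-open, [start, end).
    """
    start, end = parse_window(label)
    start_ns = int(start.timestamp()) * NS_PER_SECOND
    end_ns = int(end.timestamp()) * NS_PER_SECOND
    return start_ns <= time_ns < end_ns
```

For example, parse_window("2019-11-04T11_2019-11-05T11") yields boundaries exactly 24 hours apart.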

A special field in many CSV files is expt_id. This is a unique ID identifying the information associated with a "scheme", i.e. a pair of ABR and congestion control algorithms such as Fugu/BBR. The expt_id can be used as a key to retrieve the associated settings (e.g. algorithms and git commit) in the logs/expt_settings file. Each day has its own logs/expt_settings, containing the settings of all schemes run on Puffer between January 2019 and that day (as well as later days, if the analysis was performed later). If an expt_id is missing from the file, the file is likely out of date. The csv_to_stream_stats program in the puffer-statistics repo provides a function to parse this file. Note that the research paper also uses the term "experiment" to refer to a group of schemes, e.g. the "primary experiment", whereas the expt_id refers to a single scheme.

Additionally, we use two terms throughout the description: "stream" and "session". When a Puffer client watches TV for the first time or reloads the player page, it starts a new "session", identified by session_id in the CSVs. When a client switches channels, it enters a different "stream" but remains in the same "session", which uses the same TCP connection. Each CSV contains an index field used solely to group streams. Two data points belong to the same stream if and only if they share both session_id and index. The values of session_id and index are not otherwise meaningful.
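
The stream rule above can be sketched as a grouping on the (session_id, index) pair. This is a minimal illustration, not the puffer-statistics implementation. It assumes each CSV line has already been turned into a dict keyed by the field names listed below (for instance with csv.DictReader and explicit fieldnames); group_by_stream is a hypothetical helper name.

```python
from collections import defaultdict


def group_by_stream(rows):
    """Group data points into streams keyed by (session_id, index).

    `rows` is any iterable of dicts that contain at least the keys
    'session_id' and 'index'. Points in each stream keep their input order.
    """
    streams = defaultdict(list)
    for row in rows:
        streams[(row["session_id"], row["index"])].append(row)
    return dict(streams)
```

Because session_id and index carry no meaning beyond grouping, the sketch treats them as opaque values and never parses or orders them.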

Finally, the data format of each type of file is as follows:

logs/expt_settings: Each line begins with the expt_id and contains the corresponding configuration such as congestion control, ABR algorithm, and associated parameters. Note that the field abr_name is simply a meaningful name for a specific combination of abr and abr_config.
video_sent_X.csv: Each line represents a data point collected when the Puffer server sent a video chunk to a client. The line contains the fields below in order:
- time: timestamp (nanoseconds since Unix epoch) when the chunk is sent
- session_id: unique ID for the video session
- index
- expt_id
- channel: TV channel name. Recall that a different channel suggests a different stream given the same session_id.
- video_ts: presentation timestamp of the chunk. Each video chunk is always 2.002 seconds long, i.e., 180180 ticks of a 90 kHz clock. Therefore, continuous playback within a stream requires video_ts to increase by 180180 from one chunk to the next.
- format: encoding settings of the chunk, represented as "WxH-CRF", namely the width, height, and constant rate factor (CRF) of the video
- size: chunk size in bytes
- ssim_index: SSIM of the chunk (relative to a canonical version of the chunk). Note that it is unitless (instead of in dB).
- cwnd: congestion window size in packets (tcpi_snd_cwnd)
- in_flight: number of unacknowledged packets in flight (tcpi_unacked - tcpi_sacked - tcpi_lost + tcpi_retrans)
- min_rtt: minimum RTT in microseconds (tcpi_min_rtt)
- rtt: smoothed RTT estimate in microseconds (tcpi_rtt)
- delivery_rate: TCP's estimate of the delivery rate in bytes per second (tcpi_delivery_rate)
- buffer: playback buffer size in seconds*
- cum_rebuf: total time in seconds that the client has spent rebuffering in the current stream*
video_acked_X.csv: Each line represents a data point collected when the Puffer server receives a video chunk acknowledgement from a client. Note that this is not when a chunk is received by the client. It contains the fields below in order:
- time: timestamp (nanoseconds since Unix epoch) when the server receives a chunk acknowledgement from the client
- session_id
- index
- expt_id
- channel
- video_ts: presentation timestamp of the chunk, which can be used to find the matching chunk in video_sent_X.csv. The presentation timestamp of a channel is usually monotonically increasing but is reset to 0 from time to time (e.g., when Puffer's encoding pipeline is restarted). Therefore, it is not guaranteed to be universally unique, but duplicate matches within a short period of time are extremely rare in practice.
- buffer: playback buffer size in seconds*
- cum_rebuf: total time in seconds that the client has spent rebuffering in the current stream*
client_buffer_X.csv: Each line represents a message reported by the client, sent when certain events occur and at a regular interval. It contains the fields below in order:
- time: timestamp (nanoseconds since Unix epoch) when the server receives the client message
- session_id
- index
- expt_id
- channel
- event: type of event reported by the client, one of:
  - init: when the client switches to a new channel
  - startup: when the client starts playing a channel. The difference between init and startup is referred to as the "startup delay" of that stream.
  - rebuffer: when the client rebuffers
  - play: when the client resumes playing after rebuffering (a legacy name; resume would be more accurate)
  - timer: periodic event reported by the client, designed to fire every 250 ms. In practice, however, the interval is unreliable and is often longer, e.g., when the client's browser tab becomes inactive.
- buffer: playback buffer size in seconds*
- cum_rebuf: total time in seconds that the client has spent rebuffering in the current stream*
video_size_X.csv & ssim_X.csv: These two files contain the sizes and SSIMs of all video chunks encoded by Puffer for streaming during a single day. For each TV channel, every video chunk should be available in 10 formats (the bitrate ladder), unless cut off at the day's boundaries. Each line consists of the following fields (refer to video_sent_X.csv for any fields not explained here):
- time: timestamp (nanoseconds since Unix epoch) when an encoded video chunk is reported to the server
- format
- channel
- video_ts
- size (in video_size_X.csv) or ssim_index (in ssim_X.csv)
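
To show how the fields above fit together, here is a hedged sketch of three small helpers: matching video_acked rows to video_sent rows through (session_id, index, video_ts) to get a per-chunk transmission time, counting breaks in the 180180-tick video_ts progression, and converting the unitless ssim_index to decibels. All function names are hypothetical. Rows are assumed to be dicts with integer 'time' and 'video_ts' values, and the dB conversion uses the conventional -10 * log10(1 - SSIM) mapping, which is an assumption since this page does not state the formula.

```python
import math

CHUNK_TICKS = 180180  # one 2.002 s chunk at 90 kHz, per the video_ts description


def transmission_times(sent_rows, acked_rows):
    """Map (session_id, index, video_ts) to seconds between send and ack.

    If a key repeats in sent_rows (video_ts can reset), the last row wins;
    acks that precede their matched send are skipped.
    """
    sent_time = {}
    for row in sent_rows:
        sent_time[(row["session_id"], row["index"], row["video_ts"])] = row["time"]
    result = {}
    for row in acked_rows:
        key = (row["session_id"], row["index"], row["video_ts"])
        t_sent = sent_time.get(key)
        if t_sent is not None and row["time"] >= t_sent:
            result[key] = (row["time"] - t_sent) / 1e9  # ns -> s
    return result


def count_gaps(video_ts_values):
    """Count consecutive chunk pairs in one stream not exactly CHUNK_TICKS apart.

    `video_ts_values` must be a list in the order the chunks were sent.
    """
    pairs = zip(video_ts_values, video_ts_values[1:])
    return sum(1 for prev, cur in pairs if cur - prev != CHUNK_TICKS)


def ssim_to_db(ssim_index):
    """Convert a unitless SSIM to dB (assumed mapping: -10 * log10(1 - SSIM))."""
    if ssim_index >= 1.0:
        return math.inf
    return -10.0 * math.log10(1.0 - ssim_index)
```

Note that the time difference computed here runs from the server sending a chunk to the server receiving its acknowledgement, so it is not the moment the client finished receiving the chunk.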

*Note on buffer/cum_rebuf: The server updates its record of a client’s buffer/cum_rebuf when it receives a video_acked or client_buffer message, or an audio ack. The buffer/cum_rebuf values in video_sent are those last recorded. These fields are included in video_sent_X.csv/video_acked_X.csv/client_buffer_X.csv starting with 2021-06-12T11_2021-06-13T11. Before then, video_sent_X.csv/video_acked_X.csv do not include buffer/cum_rebuf, although the values for video_sent are available in a separate buf_video_sent_X.csv file in the buf_video_sent directory.
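
The client_buffer events and the cum_rebuf field can be combined into simple per-stream summaries. The sketch below is an illustration under assumptions, not the puffer-statistics code: it takes the client_buffer rows of a single stream as dicts with 'time' (nanoseconds), 'event', and 'cum_rebuf' keys, computes the startup delay as the gap between the first init and the following startup, and reads the last reported cum_rebuf as the stream's rebuffering total. Both function names are hypothetical.

```python
def startup_delay_s(events):
    """Seconds between the first 'init' and the next 'startup' in one stream.

    Returns None if either event is missing. Times are when the server
    received each message, so this is the delay as seen by the server.
    """
    init_time = None
    for event in sorted(events, key=lambda e: e["time"]):
        if event["event"] == "init" and init_time is None:
            init_time = event["time"]
        elif event["event"] == "startup" and init_time is not None:
            return (event["time"] - init_time) / 1e9  # ns -> s
    return None


def last_cum_rebuf_s(events):
    """cum_rebuf from the latest message of one stream, or 0.0 if there is none."""
    if not events:
        return 0.0
    latest = max(events, key=lambda e: e["time"])
    return latest["cum_rebuf"]
```

Whether cum_rebuf should be adjusted before computing a stall ratio is left to the puffer-statistics README; the sketch only reports the value as recorded.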

Plots

In addition to raw data, the Experiment Results page contains up to ten plots per day. The plots summarize the relative performance of the schemes run on that day over several time periods. Each time period in the set {day, week, two weeks, month, duration} is plotted if all schemes that ran on the day of interest (i.e. the last day of each period) ran on every day of the period. Duration is the longest contiguous period over which all schemes in the day's group ran. For each such period, we plot the SSIM and stall ratio of each scheme, averaged over streams of all speeds as well as over slow streams only.
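
The eligibility rule for time periods can be sketched as follows. This is one reading of the rule, under stated assumptions: schemes_by_day is a hypothetical mapping from a date to the set of schemes that ran on that day, and the period lengths (7, 14, and 30 days) are assumed values that this page does not define.

```python
from datetime import timedelta

# Assumed lengths in days; the page names the periods but not their lengths.
PERIOD_DAYS = {"day": 1, "week": 7, "two weeks": 14, "month": 30}


def duration_days(schemes_by_day, day):
    """Count contiguous days ending on `day` on which all of `day`'s schemes ran."""
    target = schemes_by_day.get(day, set())
    if not target:
        return 0
    count = 0
    current = day
    # A missing day maps to an empty set, which ends the loop.
    while target <= schemes_by_day.get(current, set()):
        count += 1
        current -= timedelta(days=1)
    return count


def plotted_periods(schemes_by_day, day):
    """Return (period names eligible for plotting, duration in days)."""
    duration = duration_days(schemes_by_day, day)
    periods = [name for name, days in PERIOD_DAYS.items() if duration >= days]
    if duration > 0:
        periods.append("duration")
    return periods, duration
```

With five periods and two stream subsets (all speeds, slow only), a fully eligible day yields the ten plots mentioned above.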

The plots show 95% confidence intervals. Confidence intervals are particularly important in Puffer: As discussed in the research paper, we find that the variable and heavy-tailed nature of video streaming requires remarkably large amounts of data in order to draw statistically significant conclusions about a scheme’s performance. For instance, notice that many of the single-day plots show little significant difference between the schemes, and separation often only becomes apparent over longer time periods.

The Experiment Results page is updated daily with the most recent data available. If the page does not allow the current day to be selected, its data may not yet be available; please check again later in the day.

Other Results

The Experiment Results page links to the storage bucket containing all the anonymous data collected on each day. In addition to the raw data and plots described above, this bucket includes metadata files such as a list of days each scheme ran. Please see the puffer-statistics README for a detailed description of these files.
