Archive for year: 2018
Data Overview: The Origin and Destination Survey (DB1B)
/in Data Highlights /by hcen001The Airline Origin and Destination Survey dataset, also known as DB1B, is a 10% random sample of airline tickets data collected by the Office of Airline Information of the Bureau of Transportation and Statistics from reporting carriers, which comprise the domestic carriers with over 1% of domestic enplanements (See Table 1). The data are provided to the Department of Transportation by U.S.-based (domestic) carriers, reflecting U.S. airline and code-share partner (foreign) airline routes, and contain coupon specific information for each itinerary of the DB1B Survey.
DB1B was created specifically to determine patterns of air traffic, air carrier market shares, and passenger flows. There are 25 years of available data (from 1993 up through 2018), aggregated by quarters. The dataset is divided into three tables: Coupon, Market, and Ticket. Before diving into the detailed description of each of these tables and its attributes, it is important to first understand the terminology and the relationship between the tables.
Table 1. List of carriers included in DB1B (as of 2018)
Code | Description |
9E | Endeavor Air Inc. |
AA | American Airlines Inc. |
AS | Alaska Airlines Inc. |
B6 | JetBlue Airways |
CP | Compass Airlines |
DL | Delta Air Lines Inc. |
EV | ExpressJet Airlines Inc. |
F9 | Frontier Airlines Inc. |
G4 | Allegiant Air |
G7 | GoJet Airlines LLC d/b/a United Express |
HA | Hawaiian Airlines Inc. |
MQ | Envoy Air |
NK | Spirit Air Lines |
OH | PSA Airlines Inc. |
OO | SkyWest Airlines Inc. |
QX | Horizon Air |
SY | Sun Country Airlines d/b/a MN Airlines |
UA | United Air Lines Inc. |
VX | Virgin America |
WN | Southwest Airlines Co. |
YV | Mesa Airlines Inc. |
YX | Republic Airline |
The Ticket table (as shown in Table 2) provides the most basic information about an itinerary, which includes origin airport, origin city, fare, miles flown, reporting carrier, and a flag indicating whether or not the trip is a round-trip. DB1B Market focuses on the characteristics of each directional market, shown in Table 3. For example, if in the Ticket table an itinerary is classified as a round-trip, the Market table should contain two entries detailing each trip — origin and destination — along with flight details such as miles flown, market fare, and carrier change indicator. The Coupons table, illustrated in Table 4, provides all the details from the other two tables together with any breaks along the trip.
Table 2. DB1B Ticket Metadata
Attribute | Description |
ItinID | Itinerary ID |
Coupons | Number of Coupons in the Itinerary |
Year | Year |
Quarter | Quarter (1-4) |
Origin | Origin Airport Code |
OriginAirportID | Origin Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused. |
OriginAirportSeqID | Origin Airport, Airport Sequence ID. An identification number assigned by US DOT to identify a unique airport at a given point of time. Airport attributes, such as airport name or coordinates, may change over time. |
OriginCityMarketID | Origin Airport, City Market ID. City Market ID is an identification number assigned by US DOT to identify a city market. Use this field to consolidate airports serving the same city market. |
OriginCountry | Origin Airport, Country |
OriginStateFips | Origin Airport, State FIPS |
OriginState | Origin Airport, State |
OriginStateName | Origin State Name |
OriginWac | Origin Airport, World Area Code |
RoundTrip | Round Trip Indicator (1=Yes) |
OnLine | Single Carrier Indicator (1=Yes) |
DollarCred | Dollar Credibility Indicator |
FarePerMile | Itinerary Fare Per Miles Flown in Dollars (ItinFare/MilesFlown). |
RPCarrier | Reporting Carrier |
Passengers | Number of Passengers |
ItinFare | Itinerary Fare Per Person |
BulkFare | Bulk Fare Indicator (1=Yes) |
Distance | Itinerary Distance (Including Ground Transport) |
DistanceGroup | Distance Group, in 500 Mile Intervals |
MilesFlown | Itinerary Miles Flown (Track Miles) |
ItinGeoType | Itinerary Geography Type |
Table 3: DB1B Market Metadata
Attribute | Description |
ItinID | Itinerary ID |
MktID | Market ID |
MktCoupons | Number of Coupons in the Market |
Year | Year |
Quarter | Quarter (1-4) |
OriginAirportID | Origin Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused. |
OriginAirportSeqID | Origin Airport, Airport Sequence ID. An identification number assigned by US DOT to identify a unique airport at a given point of time. Airport attributes, such as airport name or coordinates, may change over time. |
OriginCityMarketID | Origin Airport, City Market ID. City Market ID is an identification number assigned by US DOT to identify a city market. Use this field to consolidate airports serving the same city market. |
Origin | Origin Airport Code |
OriginCountry | Origin Airport, Country Code |
OriginStateFips | Origin Airport, State FIPS Code |
OriginState | Origin Airport, State Code |
OriginStateName | Origin State Name |
OriginWac | Origin Airport, World Area Code |
DestAirportID | Destination Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused. |
DestAirportSeqID | Destination Airport, Airport Sequence ID. An identification number assigned by US DOT to identify a unique airport at a given point of time. Airport attributes, such as airport name or coordinates, may change over time. |
DestCityMarketID | Destination Airport, City Market ID. City Market ID is an identification number assigned by US DOT to identify a city market. Use this field to consolidate airports serving the same city market. |
Dest | Destination Airport Code |
DestCountry | Destination Airport, Country Code |
DestStateFips | Destination Airport, State FIPS Code |
DestState | Destination Airport, State Code |
DestStateName | Destination State Name |
DestWac | Destination Airport, World Area Code |
AirportGroup | Airport Group |
WacGroup | World Area Code Group |
TkCarrierChange | Ticketing Carrier Change Indicator (1=Yes) |
TkCarrierGroup | Ticketing Carrier Group |
OpCarrierChange | Operating Carrier Change Indicator (1=Yes) |
OpCarrierGroup | Operating Carrier Group |
RPCarrier | Reporting Carrier Code |
TkCarrier | Ticketing Carrier Code for On-line Itineraries (otherwise equal to 99) |
OpCarrier | Operating Carrier Code for On-line Itineraries (otherwise equals to 99) |
BulkFare | Bulk Fare Indicator (1=Yes) |
Passengers | Number of Passengers |
MktFare | Market Fare (ItinYield*MktMilesFlown) |
MktDistance | Market Distance (Including Ground Transport) |
MktDistanceGroup | Distance Group, in 500 Mile Intervals |
MktMilesFlown | Market Miles Flown (Track Miles) |
NonStopMiles | Non-Stop Market Miles (Using Radian Measure) |
ItinGeoType | Itinerary Geography Type |
MktGeoType | Market Geography Type |
Table 4: DB1B Coupon Metadata
Attribute | Description |
ItinID | Itinerary ID |
MktID | Market ID |
SeqNum | Coupon Sequence Number |
Coupons | Number of Coupons in the Itinerary |
Year | Year |
OriginAirportID | Origin Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused. |
OriginAirportSeqID | Origin Airport, Airport Sequence ID. An identification number assigned by US DOT to identify a unique airport at a given point of time. Airport attributes, such as airport name or coordinates, may change over time. |
OriginCityMarketID | Origin Airport, City Market ID. City Market ID is an identification number assigned by US DOT to identify a city market. Use this field to consolidate airports serving the same city market. |
Quarter | Quarter (1-4) |
Origin | Origin Airport Code |
OriginCountry | Origin Airport, Country Code |
OriginStateFips | Origin Airport, State FIPS Code |
OriginState | Origin Airport, State Code |
OriginStateName | Origin State Name |
OriginWac | Origin Airport, World Area Code |
DestAirportID | Destination Airport, Airport ID. An identification number assigned by US DOT to identify a unique airport. Use this field for airport analysis across a range of years because an airport can change its airport code and airport codes can be reused. |
DestAirportSeqID | Destination Airport, Airport Sequence ID. An identification number assigned by US DOT to identify a unique airport at a given point of time. Airport attributes, such as airport name or coordinates, may change over time. |
DestCityMarketID | Destination Airport, City Market ID. City Market ID is an identification number assigned by US DOT to identify a city market. Use this field to consolidate airports serving the same city market. |
Dest | Destination Airport Code |
DestCountry | Destination Airport, Country Code |
DestStateFips | Destination Airport, State FIPS Code |
DestState | Destination Airport, State Code |
DestStateName | Destination State Name |
DestWac | Destination Airport, World Area Code |
Break | Trip Break Code |
CouponType | Coupon Type Code |
TkCarrier | Ticketing Carrier Code |
OpCarrier | Operating Carrier Code |
RPCarrier | Reporting Carrier Code |
Passengers | Number of Passengers |
FareClass | Fare Class Code. Value Is Defined By Carriers And May Not Follow The Same Standard. Not Recommended For Analysis. |
Distance | Coupon Distance |
DistanceGroup | Distance Group, in 500 Mile Intervals |
Gateway | Gateway Indicator (1=Yes) |
ItinGeoType | Itinerary Geography Type |
CouponGeoType | Coupon Geography Type |
Below is a detailed example illustrating the relationship between the tables. Let’s say a traveler purchases a round-trip ticket from Miami, Florida to Austin, Texas as shown in Figure 1. The Ticket table, shown in Table 7, records the origin airport along with the total fare price, the total distance traveled, and an indicator specifying if the trip was a round-trip. Table 6 illustrates the Market table, which further details the destination airport along with the total distance flown in each direction. Notice that the trip from Miami to Austin has two market coupons. That is because, on the way to Austin, the traveler needed to make a connection flight at Atlanta airport. The details about the connection flight are, therefore, specified in the Coupon table (Table 5).
Table 5: DB1B Coupon table example
ItinID | MktID | SeqNum | OriginAirport | DestAirport | Distance | … |
200712 | 2007113 | 1 | MIA | ATL | 597 | … |
200712 | 2007113 | 2 | ATL | AUS | 819 | … |
200712 | 2007114 | 3 | AUS | MIA | 1,106 | … |
Table 6: DB1B Market table example
ItinID | MktID | MktCoupons | OriginAirport | DestAirport | Distance | … |
200712 | 2007113 | 2 | MIA | AUS | 1,416 | … |
200712 | 2007114 | 1 | AUS | MIA | 1,106 | … |
Table 7: DB1B Ticket table example
ItinID | Coupons | OriginAirport | Fare | Distance | Roundtrip | … |
200712 | 3 | MIA | 550 | 2,522 | 1 (Yes) | … |
Although the DB1B dataset provides some helpful information on the airline market, there exist some limitations that need to be taken into account when working with this data:
- The DB1B dataset records information helpful in determining air traffic patterns, air carrier market shares, and passenger flows. Therefore, air traffic is recorded without departure and arrival times or the purchase time of tickets. This limits the potential use of the dataset, especially the joint analysis with other datasets because the coupons cannot be used to trace back the flight number.
- The DB1B dataset includes only a 10% sample of all the airline tickets from reporting carriers, which is a very small percentage of the entire population. The data size remains sufficient for overall analysis, but this can limit the validity of discovered patterns for a small segment of the data, e.g., the number of samples in specific airports can be very limited, which is the case of Rutland, VT; Jackson, TN; Tyler, TX, and Inyokern, CA.
- The DB1B data is aggregated by quarters; no monthly details can be recovered. Additionally, the time lag of the data release is at least three to four months.
A few small U.S. regional airlines — e.g., Central Airlines, Allegheny Airlines — are not required to report their data.
Featured Publication: A study found the negative effect of the competition on price dispersion
/in Featured publication /by spouy001In the paper entitled: “Does competition reduce price dispersion? New evidence from the airline industry”, Gerardi and Shapiro [1], financial economists from the Federal Reserve Bank of Atlanta and the Bureau of Economic Analysis, respectively, investigated how competition affects the dispersion and discrimination of airline prices.
Price dispersion is an economics term that refers to the fluctuation of prices across markets and sellers of identical items. The difference between price discrimination and dispersion is that the former refers to a situation when a single seller charges different customers different prices for an identical product, while the latter involves multiple sellers that charge different prices for the same product.
Data Highlights: An in-depth review of the Origin and Destination Survey (DB1B)
/in Data Highlights /by Tianyi WangThe Airline Origin and Destination Survey dataset, also known as DB1B, is a 10% random sample of airline tickets data collected by the Office of Airline Information of the Bureau of Transportation and Statistics. DB1B was created specifically to determine patterns for air traffic, air carrier market shares, and passenger flows. Here we provide some insightful analysis based on this dataset. All the content will be presented in the form of a Jupyter Notebook. You can find the notebook here.
Farelogix and Florida International University Establish Airline Data Analytics and Decision Sciences Program
/in News /by hcen001Farelogix and Florida International University (FIU) today announced the formation of the new Data Science for Airlines Informatics (DSAI) program, an advanced training program focused on airline data analytics and decision sciences. Housed at FIU as part of the university’s School of Computing and Information Sciences, the program will be dedicated to the study and implementation of advanced predictive analytics for the airline industry, using machine learning and artificial intelligence (AI). Graduate students will work collaboratively with the Farelogix data and analytics team to develop new data models for optimizing airline revenue management and generating dynamic offers. The program will open in May 2018.