Multivariate classification of the crude oil petroleum systems in southeast Texas, USA, using conventional and compositional data analysis of biomarkers

June 1, 2021

Chemically, petroleum is an extraordinarily complex mixture of different types of hydrocarbons, which can now be isolated and identified because of advances in geochemistry. Here, we use biomarkers and carbon isotopes to establish genetic differences and similarities among oil samples. Conventional approaches for evaluating biomarker and carbon isotope relative abundances include statistical techniques such as principal component analysis and cluster analysis. Because the proportions of the different hydrocarbon molecules are relative parts of a laboratory sample, the data are compositional in nature and require log-ratio approaches for adequate mathematical modeling. We apply both traditional and compositional modeling approaches to crude oil samples from an onshore area of about 50,000 square miles in southeast Texas. The data comprise 177 crude oil samples from producing oil fields, each with the key biomarker, elemental, and isotopic values commonly used in source rock correlation studies. Our results indicate that compositional modeling has higher discriminating power and lower uncertainty than the traditional approach, allowing the identification of up to 16 clusters. Each cluster represents one oil family from a source rock organofacies ranging in age from Carboniferous to Paleogene. The families provide new insights into important petroleum systems in the Texas onshore region of the Gulf of Mexico sedimentary basin.
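To illustrate what a log-ratio approach does with compositional data, the following is a minimal sketch and not the authors' workflow. The abstract does not say which log-ratio transform was used, so the centered log-ratio (clr) is an assumption here. The sample values and the function names `closure`, `clr`, and `aitchison_distance` are hypothetical and serve only as illustration.

```python
import numpy as np


def closure(parts):
    """Rescale each row of strictly positive parts so that it sums to 1."""
    parts = np.asarray(parts, dtype=float)
    if np.any(parts <= 0):
        raise ValueError("log-ratio transforms need strictly positive parts")
    return parts / parts.sum(axis=1, keepdims=True)


def clr(parts):
    """Centered log-ratio: log of each part minus the row mean of the logs."""
    log_parts = np.log(closure(parts))
    return log_parts - log_parts.mean(axis=1, keepdims=True)


def aitchison_distance(parts):
    """Pairwise Euclidean distances between clr-transformed rows."""
    z = clr(parts)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Hypothetical relative abundances: three samples (rows), four parts (columns).
samples = np.array([
    [10.0, 20.0, 30.0, 40.0],
    [1.0, 2.0, 3.0, 4.0],    # same proportions as row 0, different total
    [40.0, 30.0, 20.0, 10.0],
])

clr_values = clr(samples)
distances = aitchison_distance(samples)
```

Rows 0 and 1 have identical proportions, so their distance is zero regardless of the total amount measured. That scale invariance is the property that makes log-ratio coordinates suitable inputs for principal component or cluster analysis of relative abundances.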