Large spatial databases often labeled as geospatial big data exceed the capacity of commonly used computing systems as a result of data volume, variety, velocity, and veracity. Additional problems also labeled with V’s are cited, but the four primary ones are the most problematic and focus of this chapter (Li et al., 2016, Panimalar et al., 2017). Sources include satellites, aircraft and drone platforms, vehicles, geosocial networking services, mobile devices, and cameras. The problems in processing these data to extract useful information include query, analysis, and visualization. Data mining techniques and machine learning algorithms, such as deep convolutional neural networks, often are used with geospatial big data. The obvious problem is handling the large data volumes, particularly for input and output operations, requiring parallel read and write of the data, as well as high speed computers, disk services, and network transfer speeds. Additional problems of large spatial databases include the variety and heterogeneity of data requiring advanced algorithms to handle different data types and characteristics, and integration with other data. The velocity at which the data are acquired is a challenge, especially using today’s advanced sensors and the Internet of Things that includes millions of devices creating data on short temporal scales of micro seconds to minutes. Finally, the veracity, or truthfulness of large spatial databases is difficult to establish and validate, particularly for all data elements in the database.
|Title||Problems of Large Spatial Databases|
|Authors||E. Lynn Usery|
|Publication Type||Book Chapter|
|Publication Subtype||Book Chapter|
|Record Source||USGS Publications Warehouse|
|USGS Organization||National Geospatial Program; Center for Geospatial Information Science (CEGIS)|