USGS Data Management
Data Templates
Creating Data Templates for data collection, data storage, and metadata saves time and increases consistency. Utilizing form validation increases data entry reliability.
Why Use a Data Template?
Before data collection begins, develop a plan for each step of data collection, data entry, and data storage. More than one person is usually involved in each of these activities, and human error can reduce the quality of the data. Templates allow these processes to be controlled and standardized.
Templates During Data Entry
Before data collection, plan how the data will be recorded in the template. Make sure that the template is consistent, has clearly defined headers, rows, and terms, and includes mechanisms to reduce human error. These mechanisms are typically called forms in data processing programs such as Microsoft Excel, Microsoft Access, Google Forms, or OpenOffice Calc (see "Tools" below for more information). A form can be set up in any of these programs so that a field accepts only a certain data type; if a value of the wrong type is entered, the form rejects it. Forms can also constrain fields with pull-down menus that use a controlled vocabulary, which helps ensure that the data are entered in the correct locations.
When creating the form, give special consideration to how the columns and headers are labeled. Good labels are long enough to describe the data in the cells below or beside them, but short enough to be read easily by both humans and machines. Avoid spaces and special characters (such as * or +) in names, as they may cause problems when the data are loaded into a database. When data have an associated unit, include that unit in the header (such as "Height (cm)"). Also include the datum for any geographical data (such as "NAVD88").
After Data Entry
After data have been collected, there are several ways to check data entry.
Either while the data are being entered for the first time or afterwards, consider having another person reenter the data. The two sets of entries can then be compared to make sure that they match.
Use a program to read the data back. This allows the data that have been entered into the system to be checked against the original dataset.
Check the beginning and ending portions of the dataset for errors, and then randomly spot-check other values throughout the form.
Lastly, consider graphing or mapping the data to ensure there are no unexpected outliers.
When a template is used to store data, various spreadsheet programs can help process and visualize the data. However, some of these operations can cause problems. For example, a spreadsheet can sort a single user-selected column from A to Z or the reverse. Unless the option to include all other columns is selected, only that column is sorted and the others are left in their original order, so the connections between the values in each row are lost.
Data Storage and Metadata
Templates for data storage and metadata are also useful for standardizing procedures. Templates for data storage specify how to name files and folders consistently and stipulate where the data should be kept, for both short- and long-term storage. Similarly, metadata should have a standard template. Notes and information about the dataset are added over time by various people on the team, so a standard metadata template that specifies how this information is entered keeps the metadata organized and easily readable.
Best Practices
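As one illustration of the checks described above, the type and controlled-vocabulary constraints that a form applies, and the comparison of two independent entries of the same data, can be sketched in Python. This is only a minimal sketch and not part of any USGS tool or of the programs named above: the field names, the vocabulary, and the helper functions `validate_row` and `compare_entries` are hypothetical and invented for illustration.

```python
# Hypothetical template: field name -> (type converter, allowed values or None).
# The field names and the vocabulary are invented for illustration only.
TEMPLATE = {
    "site_id": (str, None),
    "height_cm": (float, None),
    "species": (str, {"oak", "maple", "pine"}),
}


def validate_row(row):
    """Return a list of problems found in one data-entry row (empty if none)."""
    problems = []
    for field, (convert, vocabulary) in TEMPLATE.items():
        raw = row.get(field, "")
        if raw == "":
            problems.append(f"{field}: missing value")
            continue
        try:
            value = convert(raw)  # reject values of the wrong data type
        except ValueError:
            problems.append(f"{field}: {raw!r} is not a valid {convert.__name__}")
            continue
        if vocabulary is not None and value not in vocabulary:
            problems.append(f"{field}: {raw!r} is not in the controlled vocabulary")
    return problems


def compare_entries(first, second):
    """Compare two independent entries of the same rows and return mismatches.

    Each mismatch is a tuple (row_index, field, first_value, second_value).
    """
    mismatches = []
    if len(first) != len(second):
        mismatches.append((None, "row count", len(first), len(second)))
    for index, (row_a, row_b) in enumerate(zip(first, second)):
        for field in TEMPLATE:
            if row_a.get(field) != row_b.get(field):
                mismatches.append((index, field, row_a.get(field), row_b.get(field)))
    return mismatches


# Two people enter the same field sheet; the second entry contains a typo.
entry_one = [
    {"site_id": "A1", "height_cm": "152.4", "species": "oak"},
    {"site_id": "A2", "height_cm": "98.0", "species": "maple"},
]
entry_two = [
    {"site_id": "A1", "height_cm": "152.4", "species": "oak"},
    {"site_id": "A2", "height_cm": "89.0", "species": "maple"},
]

for row in entry_one:
    print(validate_row(row))
print(compare_entries(entry_one, entry_two))
```

Both rows of `entry_one` pass validation, while the comparison flags the transposed digits in the second row's `height_cm`. That is the kind of error a second, independent entry is meant to catch. Keeping each record together as one row (here, one dictionary) also avoids the single-column sorting problem described above, because values from the same record cannot be separated.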
Disclaimer: Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Tools
References