Term | Definition |
Information | the collection of facts and patterns extracted from data |
Metadata | data about data |
Cleaning Data | a process that makes the data uniform without changing its meaning (e.g., replacing all equivalent abbreviations, spellings, and capitalizations with the same word). |
Data Filtering | choosing a smaller subset of a data set to use for analysis, for example by eliminating or keeping only certain rows in a table |
Correlation | a relationship between two pieces of data, typically referring to the amount that one varies in relation to the other. |
Citizen Science | scientific research conducted in whole or part by distributed individuals, many of whom may not be scientists, who contribute relevant data to research using their own computing devices. |
Crowdsourcing | the practice of obtaining input or information from a large number of people via the Internet |
Data Bias | a flaw in a dataset (inaccurate, incomplete, or incompatible data) that causes it to fail to represent the entire population. |
Data | facts and statistics collected together for reference or analysis. |
Conclusion | a summary of the results, whether or not the hypothesis was supported, the significance of the study, and future research. |
Processing data | operating on data to extract information; what can be processed depends on the capabilities of the users and their tools |
Data set | a file that contains one or more records. |
Inconsistencies | conflicts between different copies of the same data in a database. |
Scalability | the measure of a system's ability to increase or decrease in performance and cost in response to changes in application and system processing demands. |
CSV File | a text file format that uses commas to separate values. |
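
Several of the terms above (CSV File, Cleaning Data, Data Filtering, Correlation) can be seen working together in a short Python sketch. Everything in it is an assumption made for illustration: the CSV text, the column names (`state`, `hours_studied`, `score`), the alias table, and the helper functions `clean`, `filter_rows`, and `pearson` are hypothetical, and the Pearson coefficient is only one common way to measure correlation.

```python
import csv
import io
from statistics import mean

# Hypothetical CSV text: commas separate the values in each row.
RAW_CSV = """state,hours_studied,score
CA,1,52
calif.,2,60
California,3,71
NY,4,80
ca,5,88
"""

# Cleaning: equivalent spellings and capitalizations map to one canonical value.
# This alias table is an assumption made for the example.
STATE_ALIASES = {"ca": "CA", "calif.": "CA", "california": "CA", "ny": "NY"}


def clean(rows):
    """Make the data uniform without changing its meaning."""
    cleaned = []
    for row in rows:
        key = row["state"].strip().lower()
        cleaned.append({
            "state": STATE_ALIASES.get(key, row["state"].strip()),
            "hours_studied": float(row["hours_studied"]),
            "score": float(row["score"]),
        })
    return cleaned


def filter_rows(rows, state):
    """Filtering: keep only the rows for one state."""
    return [row for row in rows if row["state"] == state]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return covariance / (spread_x * spread_y)


rows = clean(csv.DictReader(io.StringIO(RAW_CSV)))
ca_rows = filter_rows(rows, "CA")
r = pearson(
    [row["hours_studied"] for row in ca_rows],
    [row["score"] for row in ca_rows],
)
print(f"{len(ca_rows)} CA rows, correlation = {r:.3f}")
```

A coefficient near 1 means the two columns rise together, near -1 means one falls as the other rises, and near 0 means little linear relationship. Note that cleaning has to happen before filtering here; otherwise the rows spelled "calif." or "ca" would be dropped by mistake.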