The best file formats for use in Q are those that contain a lot of metadata. These files greatly reduce the time taken to analyze data in Q and decrease the chance of mistakes in the analysis.
This article evaluates the types of file formats that are supported by Q.
Good data file formats
The following data file formats are typically the best for use in Q.
- SPSS Dimensions (Data Collection Model) data files, which are created by the more modern SPSS data collection packages. These files have the file extension (*.mdd). Note that Q expects both an .mdd file and a .ddf file. Some .mdd files come with a different type of data file (e.g., .sav, .csv, .sss); in these cases, use the other data file directly. Other .mdd files have no accompanying data file and instead refer to a database; you will need to connect to these databases directly from Q. MDD and Triple-S files have the best metadata and are best at setting up your questions correctly, and MDD files store data efficiently.
- Triple-S data files: This is a standard, open file format for survey data. Q supports Triple-S version 2.0, which can describe both fixed-format and CSV data files. Triple-S data comes in two files: one contains the metadata and has the file extension .sss, and the second contains the raw data, either in a fixed format or as a .csv file. Please note that not all .xml files are Triple-S files (the easiest way to check is to try to import the file into Q). Where you have hierarchical data files, these are set up in Q by importing each as a separate Triple-S data file and then using Q's facilities for Multiple Data Files to link them together (i.e., Q does not use the Triple-S control files for hierarchical data). Additionally, the Triple-S file format is not space-efficient, so for very large projects it can be beneficial to use another format. Triple-S and MDD files have the best metadata and are best at setting up your questions correctly. Please note that when loading .sss files, the .sss file and the associated data file must have the same file name, apart from the extension.
- SPSS data files (.sav): This is the most widely available of the good formats. There are many different ways to create these files, and some are much better than others; see How to Format an SPSS File for Use in Q. SPSS .sav files load faster than MDD and Triple-S files, and thus are good for large numbers of cases.
- SPSS Syntax files: This is the worst of the “good” file types but is the easiest to create if the software being used to export the data cannot export in one of the previous three file formats. Q only supports a subset of SPSS syntax (see Creating SPSS Syntax Files for Use in Q).
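The Triple-S naming requirement described in the list above (the .sss file and its data file must share a file name, apart from the extension) can be checked before attempting an import. The following is a minimal sketch in Python, not part of Q; the function name `find_triple_s_companions` is hypothetical, and it assumes the data file sits in the same folder as the .sss file.

```python
from pathlib import Path


def find_triple_s_companions(sss_path):
    """Return files that share the .sss file's name but have a different extension.

    Hypothetical helper for illustration only; Q performs its own checks on import.
    Assumes the data file is stored in the same folder as the .sss file.
    """
    sss = Path(sss_path)
    return sorted(
        p
        for p in sss.parent.iterdir()
        if p.is_file()
        and p.stem == sss.stem                       # same name before the extension
        and p.suffix.lower() != sss.suffix.lower()   # but not the .sss file itself
    )
```

An empty result suggests that the data file is missing or has been renamed, in which case the two files are unlikely to load together.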
Sub-optimal data file formats
These file formats do not contain metadata (i.e., there is nothing in them to indicate if a 1 means, say, Male) and, as a consequence, their use will greatly increase the amount of setup required within Q.
- CSV files (.csv): See Tips to Get an Excel or CSV File Working Well in Q for recommendations on how to set up CSV files and How to Use Excel and CSV Files in Q for an overview.
- Excel spreadsheets (.xls and .xlsx): These work identically to CSV files; see the extra documentation referred to above. Note that Q will only read the data from the first sheet in your Excel workbook. The spreadsheet should have no blank rows or columns before the data and should contain at most one row of headings, although Q version 4.9 and later can ignore blank rows or columns.
- SQL databases, including Microsoft SQL Server, MySQL and Oracle.
- Access and Excel file databases
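The layout advice for CSV and Excel data (no blank rows or columns before the data) can be checked on an exported file before loading it. The sketch below is one possible approach in Python using only the standard `csv` module; the function name `leading_blank_rows_and_columns` is an assumption made for illustration and is not something Q provides.

```python
import csv
import io


def leading_blank_rows_and_columns(csv_text):
    """Count fully blank rows and columns that precede the data in CSV text.

    Hypothetical helper for illustration; a cell counts as blank if it is
    empty or contains only whitespace.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))

    # Count rows at the top in which every cell is blank.
    blank_rows = 0
    for row in rows:
        if any(cell.strip() for cell in row):
            break
        blank_rows += 1

    # Among the remaining rows, count columns on the left that are blank throughout.
    data_rows = rows[blank_rows:]
    width = max((len(r) for r in data_rows), default=0)
    blank_cols = 0
    for col in range(width):
        if any(col < len(r) and r[col].strip() for r in data_rows):
            break
        blank_cols += 1

    return blank_rows, blank_cols
```

A result of `(0, 0)` means the data starts in the first row and first column. This sketch does not check the number of heading rows, which still needs to be confirmed by eye.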