Statistics CSV-file
What is a statistics file?
A statistics file contains the results of comparisons between groups (e.g. treated vs control) for each molecule, such as fold changes and p-values.
How to format the statistics file before uploading to Omics Studio
Your file must be a CSV file with the following columns:
| ID | Comparison | Fold_change | Log2_Fold_change | p_value | Adjusted_p_value | ... |
|---|---|---|---|---|---|---|
The following columns are required (case-insensitive):
- ID
- Comparison
- Fold Change
- Log2 Fold change
- P-value
- Adjusted P-value
Optional: you can add extra columns (e.g., metadata, annotations, etc.)
Example table
Statistics (.csv)
| ID | Comparison | Fold_change | Log2_Fold_change | p_value | Adjusted_p_value | ... |
|---|---|---|---|---|---|---|
| A0A1B0GTW7 | Treated vs Control | 1.45 | 0.84 | 0.011 | 0.044 | ... |
| H3BV60 | Treated vs Control | -0.84 | -0.25 | 0.091 | 0.102 | ... |
| O14717 | Treated vs Control | 0.22 | -2.18 | 0.943 | 0.121 | ... |
| P05067 | Treated vs Control | 1.12 | 0.17 | 0.051 | 0.943 | ... |
How Molecule IDs and Comparisons must be paired
- Only one type of molecule ID (`UniProt`, `Ensembl`, `CHEBI`, or `InChI`) can be used per file. Mixing molecule ID types is not allowed.
- Each row represents a unique combination of a molecule ID and a comparison.
- The same molecule ID (e.g., H3BV60) can appear in multiple comparisons, for example X vs Y and X vs Z.
- However, each molecule ID + comparison pair must be unique. That means you cannot have two rows with the same molecule ID for the same comparison (e.g., two entries for H3BV60 in X vs Y).
Example
Statistics (.csv)
| ID | Comparison | ... |
|---|---|---|
| A0A1B0GTW7 | Treated vs Control | ... |
| H3BV60 | Treated vs Control | ... |
| A0A1B0GTW7 | Untreated vs Treated | ... |
| H3BV60 | Untreated vs Treated | ... |
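The pairing rule can be checked before uploading. The following is a minimal sketch, not Omics Studio's own validation: the function name `find_duplicate_pairs` is hypothetical, and it assumes that comparisons are matched case-insensitively while IDs are matched exactly as written.

```python
from collections import Counter


def find_duplicate_pairs(rows):
    """Return the (ID, comparison) pairs that occur more than once.

    `rows` is an iterable of (molecule_id, comparison) tuples.
    Assumption: comparisons are compared case-insensitively; IDs are
    compared as written.
    """
    counts = Counter(
        (mol_id.strip(), comparison.strip().lower())
        for mol_id, comparison in rows
    )
    return sorted(pair for pair, n in counts.items() if n > 1)


rows = [
    ("A0A1B0GTW7", "Treated vs Control"),
    ("H3BV60", "Treated vs Control"),
    ("A0A1B0GTW7", "Untreated vs Treated"),
    ("H3BV60", "Untreated vs Treated"),
]

# The example table has no repeated pair, so nothing is reported.
print(find_duplicate_pairs(rows))  # []
```

An empty result means every molecule ID + comparison pair in the file is unique.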
⚠️ If uploading several statistics files
Make sure that there are no duplicates across files, i.e. that the same combination of ID and comparison does not appear in more than one file. For example, if UniProt ID O14717 combined with comparison X vs Y is in several files, you will not be able to create a study that includes those files.
Example
Statistics CSV file 1
| ID | Comparison | Fold_change | p_value | Adjusted_p_value |
|---|---|---|---|---|
| O14717 | X vs Y | 1.23 | 0.004 | 0.015 |
| H3BV60 | X vs Y | -0.88 | 0.041 | 0.089 |
| A0A1B0GTW7 | X vs Z | 0.35 | 0.120 | 0.200 |
| Q8IZX4 | A vs B | -1.12 | 0.009 | 0.030 |
Statistics CSV file 2
| ID | Comparison | Fold_change | p_value | Adjusted_p_value |
|---|---|---|---|---|
| O14717 (duplicate) | X vs Y (duplicate) | 1.19 | 0.006 | 0.020 |
| H3BV60 | Z vs X | 0.91 | 0.033 | 0.075 |
| A0A1B0GTW7 (duplicate) | X vs Z (duplicate) | 0.40 | 0.098 | 0.170 |
| Q9XYZ1 | A vs C | -0.51 | 0.210 | 0.310 |
How to fix it:
Remove the duplicates from either Statistics 1 or Statistics 2 before uploading again.
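To find the overlapping rows before uploading, the two files can be compared locally. This is a sketch under stated assumptions: `pair_keys` is a hypothetical helper, the header names `ID` and `Comparison` are taken from the example tables, and the CSV text is inlined here instead of being read from disk.

```python
import csv
import io


def pair_keys(csv_text):
    """Collect the (ID, comparison) keys from CSV text.

    Assumption: the file has columns named exactly "ID" and "Comparison";
    comparisons are lower-cased so that case differences do not hide a
    duplicate.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row["ID"].strip(), row["Comparison"].strip().lower()) for row in reader}


file_1 = """ID,Comparison,Fold_change,p_value,Adjusted_p_value
O14717,X vs Y,1.23,0.004,0.015
H3BV60,X vs Y,-0.88,0.041,0.089
A0A1B0GTW7,X vs Z,0.35,0.120,0.200
Q8IZX4,A vs B,-1.12,0.009,0.030
"""

file_2 = """ID,Comparison,Fold_change,p_value,Adjusted_p_value
O14717,X vs Y,1.19,0.006,0.020
H3BV60,Z vs X,0.91,0.033,0.075
A0A1B0GTW7,X vs Z,0.40,0.098,0.170
Q9XYZ1,A vs C,-0.51,0.210,0.310
"""

# Keys present in both files are the rows that block study creation.
shared = sorted(pair_keys(file_1) & pair_keys(file_2))
print(shared)  # [('A0A1B0GTW7', 'x vs z'), ('O14717', 'x vs y')]
```

Note that `Z vs X` and `X vs Z` are treated as different comparisons here, matching the example above where only the two marked rows count as duplicates.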
General overview of the columns
If you are in doubt about how the rules apply to the different mandatory columns, the sections below describe each one.
Molecule ID columns
- Must contain molecule identifiers (e.g. `UniProt`, `Ensembl`, `CHEBI`, or `InChI`).
- Only one ID type is allowed per file.
- No empty cells are allowed in this column.
- Each ID must be unique within a comparison (the same ID may appear in several different comparisons).
Example
| ID |
|---|
| A0A1B0GTW7 |
| H3BV60 |
| O14717 |
| ... |
Comparison columns
- Each value must represent a unique pairwise comparison, formatted like these examples: `Treated vs Control`, `Sample_A vs Sample_B`.
- You must add `vs` between the two.
- Format is case-insensitive (`x vs y` = `X vs Y`).
- The two parts must be different (e.g. `X vs X` is invalid).
- No empty cells are allowed in this column.
Example
| Comparison |
|---|
| Treated vs Control |
| ConditionA vs ConditionB |
| Group1 vs Group2 |
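The comparison rules above can be expressed as a small check. This is an illustrative sketch only: `parse_comparison` is a hypothetical name, and it assumes that `vs` must be surrounded by whitespace and that `VS` in upper case is also accepted, neither of which is confirmed by this page.

```python
import re


def parse_comparison(value):
    """Split 'A vs B' into (A, B), raising ValueError when a rule is broken."""
    text = value.strip()
    if not text:
        raise ValueError("comparison cell is empty")
    # Assumption: 'vs' is a separate word with whitespace on both sides.
    parts = re.split(r"\s+vs\s+", text, flags=re.IGNORECASE)
    if len(parts) != 2:
        raise ValueError(f"expected exactly one 'vs' in {value!r}")
    left, right = parts[0].strip(), parts[1].strip()
    # Case-insensitive: 'x vs X' counts as comparing a group with itself.
    if left.lower() == right.lower():
        raise ValueError(f"both sides are the same in {value!r}")
    return left, right


print(parse_comparison("Treated vs Control"))  # ('Treated', 'Control')
```

Values such as `X vs X`, an empty cell, or text without `vs` raise a `ValueError` in this sketch.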
Numeric columns
- The following columns must contain numeric values (or `NA`): `Fold_change`, `p_value`, `Adjusted_p_value`.
- Empty cells and `NA` are treated as missing values.
- Any other text (e.g. `missing`, `null`, `none`) will cause an error.
Example
| Fold_change | p_value | Adjusted_p_value |
|---|---|---|
| 1.45 | 0.0025 | 0.011 |
| -0.84 | 0.045 | 0.091 |
| 0.22 | NA | NA |
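A cell-level check along these lines can catch bad values before upload. This is a sketch, not the application's parser: `parse_numeric` is a hypothetical name, and it assumes that only the exact text `NA` marks a missing value, that scientific notation such as `1e-5` is acceptable, and that a comma is always a decimal separator (in line with the "Good to know" notes on this page).

```python
import re

# Optional sign, digits, optional decimal part with '.' or ',',
# optional exponent. Anything else is rejected.
_NUMBER = re.compile(r"[+-]?\d+(?:[.,]\d+)?(?:[eE][+-]?\d+)?")


def parse_numeric(cell):
    """Return a float, or None for a missing value; raise ValueError otherwise."""
    text = cell.strip()
    if text == "" or text == "NA":
        return None  # empty cells and NA are missing values
    if not _NUMBER.fullmatch(text):
        raise ValueError(f"not a numeric value: {cell!r}")
    # A decimal comma is converted to a point before conversion.
    return float(text.replace(",", "."))


print(parse_numeric("0,0025"))  # 0.0025
print(parse_numeric("NA"))      # None
```

Because a comma is read as a decimal separator, a value like `1,234` is interpreted as 1.234 in this sketch, and `1,234.5` is rejected.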
Additional columns
- You may include as many extra columns (e.g. gene names, annotations, or statistics) as you like.
- These are not part of the validation criteria in Omics Studio when uploading, but must still have a non-empty header.
- Empty values are allowed in these columns.
💡 Good to know
- ✅ Underscores, spaces, and hyphens are ignored (p value, p-value, or p_value are all valid).
- ✅ Plural "s" at the end is ignored (comparisons = comparison).
- ✅ Decimal separators can be either . or ,.
- ❌ Thousand separators are not allowed.
- ❌ Empty headers or missing required columns will cause upload errors.
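The header rules can be approximated with a small normalisation step. This is a sketch of one way to apply them, not Omics Studio's implementation: `normalize_header` and `missing_required` are hypothetical names, and the required list is copied from the column list earlier on this page.

```python
import re

REQUIRED = [
    "ID",
    "Comparison",
    "Fold change",
    "Log2 Fold change",
    "P-value",
    "Adjusted P-value",
]


def normalize_header(name):
    """Lower-case, drop spaces/underscores/hyphens, and strip one trailing 's'."""
    key = re.sub(r"[\s_\-]+", "", name.strip().lower())
    return key[:-1] if key.endswith("s") else key


def missing_required(headers):
    """Return the required columns that are absent; raise on an empty header."""
    if any(not h.strip() for h in headers):
        raise ValueError("empty header found")
    present = {normalize_header(h) for h in headers}
    return [col for col in REQUIRED if normalize_header(col) not in present]


headers = ["id", "Comparisons", "fold-change", "LOG2_FOLD_CHANGE", "p value", "Adjusted_p_values", "Gene"]
print(missing_required(headers))  # []
```

The same normalisation is applied to both the file's headers and the required names, so `p value`, `p-value`, and `p_value` all match the same column.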