We’re going to use the small 198-gene set for all these enrichment tools. If you don’t have the gene lists already, they are available here:
As gene symbols (CD2, DLG4, etc.).
As Ensembl gene IDs (e.g. ENSG00000183654). Harder to read, but some tools need unambiguous IDs like this for performance reasons when working with larger gene lists; e.g. a DAVID background, or STRING without a specified organism.
Or, some (but not all) tools will let you copy-paste your genes in from an Excel spreadsheet (below). Other tools still require you to upload a text file.
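For the tools that need an uploaded text file, a pasted spreadsheet column can be turned into a one-gene-per-line file with a few lines of Python. This is only a minimal sketch and not part of the workshop materials: the function names and the output filename `genes.txt` are illustrative assumptions.

```python
from pathlib import Path


def clean_gene_list(pasted_text):
    """Split pasted spreadsheet text into unique, non-empty gene identifiers."""
    seen = set()
    genes = []
    # Cells copied from a spreadsheet are separated by tabs and newlines.
    for cell in pasted_text.replace("\t", "\n").splitlines():
        gene = cell.strip()
        if gene and gene not in seen:
            seen.add(gene)
            genes.append(gene)
    return genes


def write_gene_list(genes, path):
    """Write one identifier per line, the layout upload forms typically expect."""
    Path(path).write_text("\n".join(genes) + "\n", encoding="utf-8")


# Hypothetical pasted column with a blank row, a stray space and a duplicate.
pasted = "CD2\nDLG4 \n\nCD2\n"
genes = clean_gene_list(pasted)
write_gene_list(genes, "genes.txt")  # assumed filename, change as needed
print(genes)
```

The same function works for either identifier type, since it only trims whitespace and drops blanks and duplicates while keeping the original order.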
Try gProfiler GOSt for functional enrichment: https://biit.cs.ut.ee/gprofiler/gost
Use the 198 differentially expressed gene list. Check out the results, and keep them open for the next step.
Note: Set the background gene list with Advanced Options > Statistical Domain scope > Custom over annotated Genes. This includes anything in your background that has any annotation.
Note: Under Advanced Options there is also an option to calculate under-representation, i.e. a lack of a certain term in the query list. There’s not enough statistical power to show anything for this example.
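To make the role of the background and the idea of over- versus under-representation concrete, here is a rough sketch of a one-sided hypergeometric test for a single term, using only the Python standard library. It is an illustration under stated assumptions, not the exact procedure of any tool here: the tests and multiple-testing corrections differ between tools, and all counts below are made up.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k annotated genes in a query of n genes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_p_values(k, N, K, n):
    """Return (over, under) one-sided p-values for observing k annotated genes.

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the query list, k: query genes annotated with the term.
    """
    lowest = max(0, n - (N - K))   # fewest annotated genes the query can hold
    highest = min(K, n)            # most annotated genes the query can hold
    over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))
    under = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
    return over, under


# Hypothetical counts: 198 query genes against a 15,000-gene background,
# where 300 background genes carry the term and 12 of them are in the query.
over, under = enrichment_p_values(k=12, N=15000, K=300, n=198)
print(f"over-representation p = {over:.3g}")
print(f"under-representation p = {under:.3g}")
```

With these made-up numbers the expected count is only about 4 (198 × 300 / 15000), so even a term with zero hits cannot look strongly depleted. That is one way to see why under-representation is hard to detect with a short query list, and why changing the background size `N` changes every p-value.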
Next try PANTHER: http://www.pantherdb.org/
Under Select Analysis choose the Statistical Overrepresentation test, with the ‘GO Biological Process (BP) complete’ annotation set.
PANTHER prompts for the background set after you hit Next. Under Upload List, Browse for the text file (linked above) that contains the background genes in Ensembl ID format.
Once one test has been run, you can try out other annotation sets (or parameters) from the dropdown list on the results page: different GO sets, PANTHER pathways, etc.
Explore protein-protein interactions with STRING: https://string-db.org
Paste the 198 differentially expressed genes into the ‘Multiple proteins’ option, and select human. You can accept gene mappings on the next screen with ‘continue’.
No background needed for this one, since we’re not interested in calculating the enrichment.
Explore enrichment in the pathway browser of Reactome: https://reactome.org/
Go to ‘analyse data’ and paste the 198 differentially expressed genes. No background.
Click through ‘Syndecan Interactions’, expand it in the left-hand tree, then click one of the subpathways to zoom in on the pathway view.
Try calculating functional enrichment in DAVID: https://david.ncifcrf.gov/
NB: DAVID needs Ensembl IDs (e.g. ENSG00000170075) for the background; it won’t allow gene names due to potential ambiguity. Select “ENSEMBL_GENE_ID” as the ID type.
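Before uploading a background, it can help to check that the file really contains only Ensembl gene IDs. The sketch below assumes human gene IDs of the form `ENSG` followed by 11 digits, optionally with a `.version` suffix that is stripped; the function name and the example version suffix are hypothetical.

```python
import re

# Human Ensembl gene IDs: "ENSG" followed by exactly 11 digits.
ENSG_PATTERN = re.compile(r"ENSG\d{11}")


def split_background(ids):
    """Separate identifiers into human Ensembl gene IDs and everything else."""
    valid, rejected = [], []
    for raw in ids:
        cleaned = raw.strip()
        if not cleaned:
            continue  # ignore blank lines
        # Drop an optional version suffix such as ".3" before checking.
        candidate = cleaned.split(".")[0]
        if ENSG_PATTERN.fullmatch(candidate):
            valid.append(candidate)
        else:
            rejected.append(cleaned)
    return valid, rejected


# The ".3" suffix is made up for illustration; "CD2" is a gene symbol.
valid, rejected = split_background(
    ["ENSG00000170075", "ENSG00000183654.3", "CD2", ""]
)
print("usable as background:", valid)
print("needs converting first:", rejected)
```

Anything that ends up in `rejected` (gene symbols, for instance) would need converting to Ensembl IDs before it can go into the background.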