Author(s): Joe Solvason, Maggie Ma
Contact: Joe Solvason (solvason@ucsd.edu)
Adapted as a GenePattern Module by: Ted Liefeld (jliefeld@cloud.ucsd.edu)
Task Type: Transcription factor analysis
LSID: urn:lsid:genepattern.org:module.analysis:00443
GenerateMotifDatabase can be used to obtain position weight matrix (PWM) data to score binding sites in other TFSites modules. This tool extracts the position frequency matrix (PFM) for one or more transcription factors of interest and then converts each PFM to its respective PWM.
Optionally, the user can provide keywords to filter the input PFM data. Keywords are matched against the header line of each matrix and can include the transcription factor name and/or matrix ID. Matching ignores letter case and spaces, and every PFM whose header contains a keyword is returned. The user can also choose to use pseudocounts and to specify the background nucleotide frequencies for A, T, G, and C. The output file uses the same format as the input, with each position frequency matrix replaced by its position weight matrix.
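The PFM-to-PWM conversion can be illustrated with a minimal sketch. The formula below is an assumption, not the module's confirmed implementation: each column's counts receive a pseudocount of sqrt(N) (N is the column total) split according to the background frequencies, and the score is the log2 ratio of the resulting probability to the background. This scheme appears to reproduce the ETS1 values in the example output of this document. The function name `pfm_to_pwm` and its arguments are hypothetical.

```python
import math


def pfm_to_pwm(pfm, background=None, pseudocounts=True):
    """Convert a PFM (dict of base -> list of counts) into a log2-odds PWM.

    Hypothetical helper: the sqrt(N) pseudocount scheme is an assumption
    inferred from the example output, not the module's documented method.
    """
    if background is None:
        background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    bases = list(pfm)
    length = len(pfm[bases[0]])
    pwm = {base: [] for base in bases}
    for pos in range(length):
        total = sum(pfm[base][pos] for base in bases)
        # Assumed scheme: sqrt(total) pseudocounts, split by background.
        pseudo_total = math.sqrt(total) if pseudocounts else 0.0
        for base in bases:
            count = pfm[base][pos] + pseudo_total * background[base]
            prob = count / (total + pseudo_total)
            # A zero probability has no finite log-odds score.
            if prob > 0:
                score = math.log2(prob / background[base])
            else:
                score = float("-inf")
            pwm[base].append(score)
    return pwm


# ETS1 counts taken from the example input (MA0098.1).
ets1 = {
    "A": [4, 17, 0, 0, 0, 5],
    "C": [16, 0, 1, 39, 39, 3],
    "G": [4, 0, 0, 1, 0, 17],
    "T": [16, 23, 39, 0, 1, 15],
}
pwm = pfm_to_pwm(ets1)
print(round(pwm["A"][0], 4))  # about -1.0531, as in the ETS1 example output
```

Without pseudocounts, any position with a zero count scores negative infinity in this sketch, which is one reason pseudocounts are commonly enabled before scoring binding sites.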
* indicates required parameter
Default = None
Default = False
Default = 0.25,0.25,0.25,0.25
>MA0002.1 RUNX1
A [ 10 12 4 1 2 2 0 0 0 8 13 ]
C [ 2 2 7 1 0 8 0 0 1 2 2 ]
G [ 3 1 1 0 23 0 26 26 0 0 4 ]
T [ 11 11 14 24 1 16 0 0 25 16 7 ]
>MA0098.1 ETS1
A [ 4 17 0 0 0 5 ]
C [ 16 0 1 39 39 3 ]
G [ 4 0 0 1 0 17 ]
T [ 16 23 39 0 1 15 ]
...
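Input in the layout shown above (a `>` header line followed by one bracketed row per nucleotide, as in JASPAR-format files) could be read with a small parser along these lines. This is a sketch only: `parse_pfms` is a hypothetical name, and the handling of blank or unrecognized lines is an assumption rather than the module's behavior.

```python
import re


def parse_pfms(text):
    """Parse JASPAR-style text into a list of (header, matrix) pairs.

    Hypothetical helper; matrix maps each base to a list of floats.
    """
    records = []
    header, matrix = None, {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, matrix))
            header, matrix = line[1:].strip(), {}
        elif header is not None and line and line[0] in "ACGT":
            # Pull plain integers or decimals out of "A [ 4 17 0 ... ]".
            values = re.findall(r"-?\d+(?:\.\d+)?", line[1:])
            matrix[line[0]] = [float(v) for v in values]
        # Any other line (blank, "...") is skipped in this sketch.
    if header is not None:
        records.append((header, matrix))
    return records


sample = """>MA0002.1 RUNX1
A [ 10 12 4 1 2 2 0 0 0 8 13 ]
C [ 2 2 7 1 0 8 0 0 1 2 2 ]
G [ 3 1 1 0 23 0 26 26 0 0 4 ]
T [ 11 11 14 24 1 16 0 0 25 16 7 ]
>MA0098.1 ETS1
A [ 4 17 0 0 0 5 ]
C [ 16 0 1 39 39 3 ]
G [ 4 0 0 1 0 17 ]
T [ 16 23 39 0 1 15 ]
"""
records = parse_pfms(sample)
print([header for header, _ in records])
```

Keeping the full header string with each matrix makes it straightforward to screen headers for keywords and to write the header back unchanged in the output.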
Keyword
ETS
ETV
GABP
FLI
SPI
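The header screening for a keyword list like this one can be sketched as follows. The sketch assumes that a header is kept when it contains any one of the keywords, which is consistent with the example output listing both ETS1 and SPIB, and that "regardless of letter case or spaces" means both sides are lowercased with whitespace removed before comparison. The names `normalize` and `header_matches` are hypothetical.

```python
def normalize(text):
    """Lowercase the text and drop all whitespace."""
    return "".join(text.split()).lower()


def header_matches(header, keywords):
    """Return True if the header contains any keyword (assumed any-match).

    With no keywords, every header is kept, mirroring a default of None.
    """
    if not keywords:
        return True
    normalized_header = normalize(header)
    return any(
        normalize(keyword) in normalized_header
        for keyword in keywords
        if normalize(keyword)  # ignore empty keywords
    )


headers = ["MA0002.1 RUNX1", "MA0098.1 ETS1", "MA0081.1 SPIB"]
keywords = ["ETS", "ETV", "GABP", "FLI", "SPI"]
selected = [h for h in headers if header_matches(h, keywords)]
print(selected)  # ['MA0098.1 ETS1', 'MA0081.1 SPIB']
```

Because matching is by substring, a short keyword such as ETS also selects related headers such as ETS1, so more specific keywords or matrix IDs narrow the result.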
>MA0098.1 ETS1
A [ -1.053145688775872 0.6820617990250051 -2.872741175013566 -2.8727411750135663 -2.8727411750135663 -0.8153679664067708 ]
C [ 0.6022513974791806 -2.8727411750135654 -2.1656974816113896 1.8090322245000712 1.8090322245000712 -1.337998938483863 ]
G [ -1.053145688775872 -2.8727411750135654 -2.872741175013566 -2.1656974816113896 -2.8727411750135663 0.6820617990250051 ]
T [ 0.6022513974791806 1.0857746289240513 1.8090322245000716 -2.8727411750135663 -2.1656974816113896 0.5177659702222613 ]
>MA0081.1 SPIB
A [ 1.2260680794798462 -1.2837929660005913 0.8791456051256142 -1.9004643264490857 -2.347923303420307 1.8292696984860446 1.7999753916920067 ]
C [ -3.0 0.16992500144231237 0.3423921974470779 -3.0 -3.0 -3.0 -3.0 ]
G [ -3.0 1.135159583281636 -0.6780719051126377 1.7999753916920067 1.7999753916920067 -3.0 -3.0 ]
T [ 0.4964258261194989 -1.9004643264490857 -1.9004643264490857 -3.0 -2.347923303420307 -2.347923303420307 -1.9004643264490857 ]
...
Example input data is available here.