Westerners, please see this picture,
which comes from http://www.loong.cn
and which shows that the symbol of Chinese culture is never a nasty dragon, but the Loong! We are working to correct the mistake!!
Monday, June 15, 2009
librdf Python API summary
I reviewed the code I wrote with the Redland librdf Python API and put together a brief summary as a memo. For more advanced and powerful parsing, I'm turning to some Java libraries, such as Jena, owlapi, Pellet, ...
Update: in the end I have kept working with python-librdf the whole time and put my Java stuff away...
** RDF.Model object
import RDF
model = RDF.Model(RDF.MemoryStorage())
Model using in-memory storage
model = RDF.Model(RDF.HashStorage(bdb_location,options="hash-type='bdb'"))
Model using Berkeley DB storage
p = RDF.Parser('raptor')
file_uri = RDF.Uri('file:/path/to/rdf_file')
Create a URI indicating a local file
p.parse_into_model(model,file_uri)
Parse an RDF file into the model. Returns a boolean indicating whether the operation succeeded. (Multiple files can be parsed into one model.)
len(model)
Returns the number of statements in the model. This only applies to models with in-memory storage; it won't work for Berkeley DB storage.
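A small aside on the file URI step above: a hand-typed 'file:/...' string is easy to get wrong for relative paths. Below is a minimal sketch that builds the URI string with the standard library's pathlib. The helper name file_uri_for is my own invention and not part of librdf, and I am assuming the resulting string can be handed to RDF.Uri(...) in the same way as the hand-typed one.

```python
from pathlib import Path


def file_uri_for(path):
    """Return a file: URI string for a local path (hypothetical helper)."""
    # absolute() anchors a relative path at the current directory without
    # touching the filesystem; as_uri() only accepts absolute paths.
    return Path(path).absolute().as_uri()


# 'data/example.rdf' is a made-up path used only for illustration.
uri_string = file_uri_for('data/example.rdf')
print(uri_string)
```

The string can then be used as RDF.Uri(uri_string) before calling p.parse_into_model(...).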
** RDF.Node object type
The node.type attribute is an integer indicating the type of the node:
1: resource node; the RDF.Uri object is available as node.uri
2: literal node; its value can be extracted as node.literal_value['string']
4: blank node; in OWL it usually appears as the object of rdfs:subClassOf, as a restriction on properties.
It is better to use node.is_literal(), node.is_resource() and node.is_blank() to test the node type, to avoid confusion.
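The type tests above can be wrapped in one small dispatch function. This is only a sketch: describe_node is a name I made up, and it relies solely on the methods and attributes mentioned in this post (is_literal(), is_resource(), is_blank(), literal_value, uri, blank_identifier). The FakeLiteral class is a stand-in so that the snippet runs without librdf installed; with the real library you would pass an RDF.Node.

```python
def describe_node(node):
    """Return a (kind, value) pair for an RDF.Node-like object."""
    if node.is_literal():
        return ('literal', node.literal_value['string'])
    if node.is_resource():
        return ('resource', str(node.uri))
    if node.is_blank():
        return ('blank', node.blank_identifier)
    return ('unknown', None)


class FakeLiteral(object):
    """Stand-in for a literal RDF.Node, for demonstration only."""
    literal_value = {'string': 'hello'}

    def is_literal(self):
        return True

    def is_resource(self):
        return False

    def is_blank(self):
        return False


print(describe_node(FakeLiteral()))
```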
** Simple query methods (bound to RDF.Model object)
All simple query methods supported by the RDF.Model object accept RDF.Node objects. These methods also return RDF.Node objects.
(model indicates RDF.Model object)
result = model.get_target(a,b)
result = model.get_predicate(a,b)
result = model.get_source(a,b)
Returns a RDF.Node object, or None upon failure
results = model.get_targets(a,b)
results = model.get_predicates(a,b)
results = model.get_sources(a,b)
Always return an RDF.Iterator object containing a sequence of RDF.Node objects.
Iteration:
for result in results:
    print(result)
Check for end:
results.end() # returns 0 or 1 depending on whether it is exhausted.
Membership test:
my_node in results # returns a boolean
** Not simple query methods
Create an RDF.Query object:
query = RDF.Query(query_string,query_language='xxx')
The query language is either rdql or sparql; the default is rdql.
SPARQL query built with the new str.format() syntax (literal braces have to be doubled):
query = RDF.Query('SELECT ?s WHERE {{ ?s <{0}> <{1}> }}'.format(...),query_language='sparql')
results = query.execute(model)
results is an RDF.QueryResults object, which is also an iterator
Check for end:
results.finished()
for this_re in results:
    print(this_re['s'])
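To make the doubled braces in the format string above concrete, here is a minimal sketch in plain Python with no librdf involved. The function name build_select and the two URIs are just illustrative choices of mine.

```python
def build_select(predicate_uri, object_uri):
    """Build a SPARQL SELECT string for a fixed predicate and object."""
    # '{{' and '}}' are escaped braces that survive as '{' and '}';
    # '{0}' and '{1}' are the positional format fields.
    template = 'SELECT ?s WHERE {{ ?s <{0}> <{1}> }}'
    return template.format(predicate_uri, object_uri)


# Example URIs chosen for illustration: find everything typed as owl:Class.
query_string = build_select(
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
    'http://www.w3.org/2002/07/owl#Class')
print(query_string)
```

The resulting string is what would be passed to RDF.Query(query_string,query_language='sparql').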
** Blank node
Blank nodes get a special note here because they are frequently used in collection-type domain/range declarations and in property restrictions on classes, all of which occur in OWL.
A blank node does not have a uri attribute and cannot be converted to an RDF.Uri object. It can easily be used in the RDF.Model object-bound queries, as they readily accept node objects as arguments.
To use it in a ``not-simple'' query, SPARQL query syntax (but not RDQL) has to be used:
# node is a blank node
node_str = '_:'+node.blank_identifier
q = RDF.Query('SELECT ?predicate ?object WHERE {{ {0} ?predicate ?object }}'.format(node_str),query_language='sparql')
results = q.execute(model)
Problem: when working with the UniProt OWL, such a query retrieves all the blank nodes. For example, a SPARQL query on a blank node with predicate owl:unionOf retrieves 26 blank nodes as objects, whereas a get_targets() query correctly retrieves only 1 blank node. Will check if this is a bug or not. (A likely explanation: in SPARQL, a blank node label in a query pattern acts as a variable rather than as a reference to that specific node in the data, so the pattern matches any subject.)
Nodes with collection parseType are frequently used, e.g. in domain/range specifications.
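Since a collection is stored as a chain of blank nodes linked by rdf:first and rdf:rest and terminated by rdf:nil, its members can be collected with nothing but get_target(). The following is a rough sketch under that assumption. collection_items and FakeModel are names I made up; FakeModel is a dictionary-backed stand-in so the snippet runs without librdf, and plain strings stand in for RDF.Node objects. With the real library you would pass an RDF.Model and nodes built from the RDF namespace URIs.

```python
RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'


def collection_items(model, head, rdf_first, rdf_rest, rdf_nil):
    """Follow rdf:first / rdf:rest links from head and return the members."""
    items = []
    node = head
    while node is not None and node != rdf_nil:
        item = model.get_target(node, rdf_first)
        if item is not None:
            items.append(item)
        node = model.get_target(node, rdf_rest)
    return items


class FakeModel(object):
    """Dictionary-backed stand-in offering only get_target(), demo only."""

    def __init__(self, triples):
        self._targets = dict(((s, p), o) for (s, p, o) in triples)

    def get_target(self, source, predicate):
        return self._targets.get((source, predicate))


FIRST, REST, NIL = RDF_NS + 'first', RDF_NS + 'rest', RDF_NS + 'nil'
demo = FakeModel([
    ('_:b1', FIRST, 'ex:A'), ('_:b1', REST, '_:b2'),
    ('_:b2', FIRST, 'ex:B'), ('_:b2', REST, NIL),
])
print(collection_items(demo, '_:b1', FIRST, REST, NIL))
```

Because this goes through the model-bound query methods, it sidesteps the SPARQL blank node issue described above.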
Friday, June 12, 2009
Generic genome browser on Ubuntu
I'm preparing some stuff for the workshop, so I'm getting back to gbrowse again.
** Installation steps:
gbrowse version: 1.69
Ubuntu version: 9.04
perl version: 5.10.0
bioperl version: don't know how to figure that out...
$ sudo apt-get install libapache2-mod-perl2
$ sudo apt-get install libapache2-mod-perl2-dev
$ sudo apt-get install libapache2-mod-perl2-doc
$ sudo apt-get install apache2-doc
Verify that the directory /usr/lib/cgi-bin exists; if not, create it.
$ sudo apt-get install libgd2-noxpm-dev
$ sudo apt-get install mysql-server
$ sudo apt-get install mysql-client
Use cpan to install all prerequisite Perl modules as listed in INSTALL.
Then install gbrowse from source:
$ perl Makefile.PL
It complained that the Bio::Graphics module was too old, so I upgraded it using cpan. (The upgrade did not succeed until the graphviz software was installed.)
$ cpan
cpan> upgrade Bio::Graphics
$ perl Makefile.PL
$ make
$ sudo make install
Finished. I was really a bit surprised to see so many fancy features in this version of gbrowse. The last time I worked with gbrowse was in 2006.
** The installation put a number of scripts into my system directories. For example, these are found in /usr/local/bin:
bp_search2alnblocks bp_search2tribe bp_seqfeature_load.pl
bp_search2alnblocks.pl bp_search2tribe.pl bp_seq_length
bp_search2BSML bp_search_overview bp_seq_length.pl
bp_search2BSML.pl bp_seqconvert bp_seqret
bp_search2gff bp_seqconvert.pl bp_seqret.pl
bp_search2gff.pl bp_seqfeature_delete.pl bp_seqretsplit.pl
bp_search2table bp_seqfeature_gff3.pl
bp_search2table.pl bp_seqfeature_load
... and much more!!
Gbrowse configuration files are located in ``/etc/apache2/gbrowse.conf/''
** use bp_seqfeature_load.pl to initialize the mysql database.
** GFF3 format file nuisance:
The gff3 file for E. coli that I downloaded from NCBI was rejected by gbrowse!
The scaffold declaration entry (the first row of the gff contents) differs from the one in the example gff3 file that comes with the gbrowse program. The TYPE (3rd column) has to be ``chromosome'', and the 9th column has to contain ID and Name attributes.
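A quick way to patch such a line is a few lines of Python. This is only a sketch of the fix described above, not something that ships with gbrowse: the function name fix_scaffold_line is my own, and the sample line is made up for illustration rather than copied from the NCBI file.

```python
def fix_scaffold_line(line, seq_name):
    """Set the type column to 'chromosome' and make sure ID/Name exist."""
    cols = line.rstrip('\n').split('\t')
    # Leave comments, directives and malformed rows untouched.
    if line.startswith('#') or len(cols) != 9:
        return line
    cols[2] = 'chromosome'
    attrs = [a for a in cols[8].split(';') if a and a != '.']
    keys = set(a.split('=', 1)[0] for a in attrs)
    # Prepend the missing attributes while keeping any existing ones.
    if 'Name' not in keys:
        attrs.insert(0, 'Name=' + seq_name)
    if 'ID' not in keys:
        attrs.insert(0, 'ID=' + seq_name)
    cols[8] = ';'.join(attrs)
    return '\t'.join(cols) + '\n'


# Made-up sample row: 9 tab-separated columns, type 'region', no ID/Name.
sample = 'seq1\tRefSeq\tregion\t1\t5000\t.\t+\t.\tnote=demo\n'
print(fix_scaffold_line(sample, 'seq1'))
```

Only the first feature row needs this treatment; the remaining rows can be copied through unchanged.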
Monday, June 8, 2009
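Untitled
Dong-dong-dong-qiang!! Folks from back home, listen up: classmate adug's blog... has moved!
The current address is: http://blog.sina.com.cn/adugduzhou
Basically, although there isn't much content yet, its personal *character* is already beginning to show; as the saying goes, ``The little lotus has only just shown its pointed tip, and a lump of bird poop has already landed on top!''
Still, I really envy that sense of contentment of his. When will I be able to feel that way too? It seems I never have... I once heard Professor Li Ning (not the one who sells clothes, the one who clones cattle) say that life is hard, and that even a moment of contentment is a luxury! I just laughed foolishly at the time, but how much I agree with his insight now! There was once a brief space in which I could have puffed myself up, but I treated it like air and looked right through it. Sigh... I'll bear it~ Let the lonely air wrap around me while I drag my two legs and keep trudging on alone...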
Sunday, June 7, 2009
Bird nest (2)
Friday, June 5, 2009
X11 display reconfiguration for Ubuntu on Dell T300
I just upgraded Ubuntu on our Dell T300 machine to 9.04. Fancy... but we got a surprisingly slow response in gnome-terminal.
I googled and found out that Ubuntu 9.04 does not work very well with the Intel video chipset of this machine, as discussed here.
The solution is quite simple (though the instructions kept me trying for a while). Become root with sudo, edit the file /etc/X11/xorg.conf, and add the following line to the Device section:
Option "MigrationHeuristic" "greedy"
Then log out and log back in. Solved!
Tuesday, June 2, 2009
Bird nest
I spotted a bird nest today, right beside a trail on my daily jogging route!
From the second photo, you can see it's on open ground and very close to the trail, where runners and bikers pass by, and sometimes even cars (luckily very few). I'm really worried that this nest could easily be destroyed, either by people or by squirrels! I placed some larger rocks around the nest, hoping they will offer some protection, or at least make others notice it!
When I was there, the nest owner was around me all the time, singing angrily. I really don't know which kind of bird it is.
So interesting...