Tuesday, June 16, 2009

To Lake Livingston

I watched a swimming competition for schoolchildren in the town of Huntsville, about 40 miles east of College Station.




A cool hat.



After that, we went to Lake Livingston to have fun.

It was my first time seeing a group of Harley motorcycle riders, and I was a little bit astonished...



The lakefront was so occupied by ``private properties'' that we could not even walk along the shore. But finally we found a small public park to enjoy. A lot of people were camping there. The lake water is not very clear, but many people were still playing in it.

Monday, June 15, 2009

Never a dragon

Westerners, please look at this picture:


(The picture comes from http://www.loong.cn)

The point is: the symbol of Chinese culture is never a nasty dragon, but the Loong! We are working to correct the mistake!!

librdf Python API summary

I reviewed the code I wrote with the Redland librdf Python API and made a brief summary as a memo to myself. For more advanced and powerful parsing, I'm turning to Java libraries such as Jena, owlapi, Pellet, ...

** Start doing the thing:

import RDF

in memory model
model = RDF.Model(RDF.MemoryStorage())

Berkeley DB model
model = RDF.Model(RDF.HashStorage(bdb_location,options="hash-type='bdb'"))

p = RDF.Parser('raptor')

f_uri = RDF.Uri(string='file:/path/to/rdf_file')

p.parse_into_model(model,f_uri)
(multiple files can be parsed into one model)
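The file: URI above has to be written out by hand for each file. Here is a minimal sketch of building those URIs from ordinary paths, assuming Python 3 (for pathlib). The helper name file_uri and the file names are made up for illustration, and the Redland calls are only shown in comments because they need the bindings installed.

```python
from pathlib import Path


def file_uri(path):
    """Return an absolute file: URI string for a local path (hypothetical helper)."""
    # resolve() makes the path absolute; as_uri() requires an absolute path
    return Path(path).resolve().as_uri()


rdf_files = ["data/a.rdf", "data/b.rdf"]  # hypothetical file names
uris = [file_uri(p) for p in rdf_files]

# With the Redland bindings available, each URI could then be parsed into
# the same model, following the calls in the notes above:
#   for u in uris:
#       p.parse_into_model(model, RDF.Uri(string=u))
```

Note that as_uri() produces the file:///... form and percent-encodes characters such as spaces.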



** Simple query (fetch one or multiple components from triple store)

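result = model.get_target(source, predicate)
result = model.get_predicate(source, target)
result = model.get_source(predicate, target)
Each returns an RDF.Node object, or None upon failure.

results = model.get_targets(source, predicate)
results = model.get_predicates(source, target)
results = model.get_sources(predicate, target)
Each returns a sequence of RDF.Node objects, or None upon failure.
The result can be iterated like a list.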
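Since these calls may hand back None, it is handy to wrap them. The following is only a sketch: target_strings is a name I made up, and _FakeModel is a hypothetical stand-in so the snippet runs without Redland installed. A real RDF.Model would be passed in its place.

```python
def target_strings(model, source, predicate):
    """Collect str() of every target node for (source, predicate)."""
    nodes = model.get_targets(source, predicate)
    if nodes is None:  # per the notes above, None signals failure
        return []
    return [str(node) for node in nodes]


class _FakeModel:
    """Hypothetical stand-in for RDF.Model, for demonstration only."""

    def __init__(self, triples):
        self.triples = triples

    def get_targets(self, source, predicate):
        hits = [o for (s, p, o) in self.triples if s == source and p == predicate]
        return iter(hits) if hits else None


demo = _FakeModel([("ex:a", "ex:knows", "ex:b"), ("ex:a", "ex:knows", "ex:c")])
found = target_strings(demo, "ex:a", "ex:knows")
missing = target_strings(demo, "ex:x", "ex:knows")
```

The caller always gets a plain list back, so there is no need to check for None at every call site.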


** Not simple queries
Create a query object:
query = RDF.Query(query_string,query_language='xxx')

The query language is either 'rdql' or 'sparql'; the default is 'rdql'.

SPARQL query built with the new str.format() syntax (literal braces must be doubled):
query = RDF.Query('SELECT ?s WHERE {{ ?s <{0}> <{1}> }}'.format(...),query_language='sparql')

results = query.execute(model)
results is a RDF.QueryResults object, also an iterator

results.finished() returns a boolean telling whether the results are finished/empty.
for this_re in results:
    print(this_re['s'])

this_re['s'] is a RDF.Node object.
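The doubled braces in the format string are easy to get wrong, so here is a small sketch of just the string-building part, which runs without Redland. The function name build_select_subjects and the example.org URI are made up for illustration. No escaping of the URIs is attempted.

```python
def build_select_subjects(predicate_uri, object_uri):
    """Build a SPARQL query selecting subjects with the given predicate and object."""
    # Doubled braces {{ }} come out as literal { } after str.format();
    # {0} and {1} are replaced by the two URIs.
    template = 'SELECT ?s WHERE {{ ?s <{0}> <{1}> }}'
    return template.format(predicate_uri, object_uri)


query_string = build_select_subjects(
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
    'http://example.org/Gene',  # hypothetical class URI
)
print(query_string)
```

The resulting string is what would be passed to RDF.Query(query_string, query_language='sparql').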


A literal RDF.Node can be printed directly, or its string value can be read with:
node.literal_value['string']

Friday, June 12, 2009

Generic genome browser on Ubuntu

I'm preparing some stuff for the workshop, so I'm getting back to gbrowse again.

** Installation steps:

gbrowse version: 1.69
Ubuntu version: 9.04
perl version: 5.10.0
bioperl version: don't know how to figure that out...

$ sudo apt-get install libapache2-mod-perl2
$ sudo apt-get install libapache2-mod-perl2-dev
$ sudo apt-get install libapache2-mod-perl2-doc
$ sudo apt-get install apache2-doc

Verify that the directory /usr/lib/cgi-bin exists; if not, create it.

$ sudo apt-get install libgd2-noxpm-dev
$ sudo apt-get install mysql-server
$ sudo apt-get install mysql-client

Use cpan to install all prerequisite Perl modules as listed in INSTALL.

Started to install gbrowse from source code:

$ perl Makefile.PL

It complained that the Bio::Graphics module is too old, so upgrade it using cpan. (The upgrade did not succeed until the graphviz software was installed.)

$ cpan
cpan> upgrade Bio::Graphics

$ perl Makefile.PL
$ make
$ sudo make install

Finished. I was really a bit surprised to see so many fancy features in this version of gbrowse. The last time I worked with gbrowse was in 2006.

** It stuffed some scripts into my system directories. For example, these are found in /usr/local/bin:
bp_search2alnblocks bp_search2tribe bp_seqfeature_load.pl
bp_search2alnblocks.pl bp_search2tribe.pl bp_seq_length
bp_search2BSML bp_search_overview bp_seq_length.pl
bp_search2BSML.pl bp_seqconvert bp_seqret
bp_search2gff bp_seqconvert.pl bp_seqret.pl
bp_search2gff.pl bp_seqfeature_delete.pl bp_seqretsplit.pl
bp_search2table bp_seqfeature_gff3.pl
bp_search2table.pl bp_seqfeature_load
... and much more!!

Gbrowse configuration files are located in ``/etc/apache2/gbrowse.conf/''

** Use bp_seqfeature_load.pl to initialize the MySQL database.

** GFF3 format file nuisance:

The gff3 file for E. coli that I downloaded from NCBI was rejected by gbrowse!
The scaffold declaration entry (the first row of the gff contents) differs from the one in the example gff3 file that comes with the gbrowse program. The TYPE (3rd column) has to be ``chromosome'', and the 9th column has to contain ID and Name attributes.
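The fix described above can be sketched in a few lines of Python. This is not the exact procedure I followed, just an illustration under some assumptions: the first non-comment line is the scaffold entry, and reusing the sequence ID (column 1) as both ID and Name is acceptable. The function names and the sample line are made up.

```python
def fix_scaffold_line(line):
    """Set column 3 to 'chromosome' and make sure ID and Name are in column 9."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns")
    cols[2] = "chromosome"
    # Column 9 holds key=value pairs separated by ';' ('.' means empty)
    attrs = [a for a in cols[8].split(";") if a and a != "."]
    keys = {a.split("=", 1)[0] for a in attrs}
    if "ID" not in keys:
        attrs.insert(0, "ID=" + cols[0])    # assumption: reuse the seqid
    if "Name" not in keys:
        attrs.append("Name=" + cols[0])     # assumption: reuse the seqid
    cols[8] = ";".join(attrs)
    return "\t".join(cols)


def fix_gff3(lines):
    """Rewrite only the first feature line; pass comments and the rest through."""
    fixed_first = False
    out = []
    for line in lines:
        if not fixed_first and line.strip() and not line.startswith("#"):
            out.append(fix_scaffold_line(line))
            fixed_first = True
        else:
            out.append(line.rstrip("\n"))
    return out


sample = [
    "##gff-version 3",
    "NC_000913\tRefSeq\tregion\t1\t4639675\t.\t+\t.\tDbxref=taxon:511145",  # made-up line
]
fixed = fix_gff3(sample)
```

Only the first feature line is touched, so the gene entries further down stay as downloaded.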

Monday, June 8, 2009

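Untitled

Dong-de-dong-dong-qiang!! Folks, listen up: adug's blog... has moved!

The current address is: http://blog.sina.com.cn/adugduzhou

Basically, although there is not much content yet, the personal *style* is already starting to show. As the saying goes, ``The little lotus has just shown its pointed tip, and a lump of bird droppings lands on its head!''

Still, I really envy that sense of contentment. When will I be able to feel that way too? It seems I never have... I once heard Professor Li Ning (not the one who sells clothes, the one who clones cattle) say that life is hard, and even a moment of contentment is a luxury! Back then I just laughed foolishly, but now how much I agree with his insight! I once had a brief moment with room to swell with pride, but I ignored it as if it were air. Sigh... I'll put up with it~ Let the lonely air wrap around me as I drag my two legs and keep walking on alone...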