Today I needed to pull a web page down from the internet and extract some specific content from it in PHP. Sounds like a crawler, huh? Not a real crawler, actually, just pulling our own content. I did it this way because it’s not convenient for me to access the database directly.
I’m not very familiar with PHP, but with version 5 on my local dev machine, I was able to do this very quickly. Just use file_get_contents to get the whole page as a string, and then use preg_match_all to search for the parts I want.
Unexpected things happened after I uploaded the script to the server. It said the function file_get_contents was not defined. Then I realized that the server was running Red Hat 9, and the PHP I was using was version 4.2.2, bundled with RH9 (file_get_contents was only added in PHP 4.3.0). OK. I rewrote the code to use fopen/fread directly. This time, it complained that it couldn’t handle the scheme (I don’t remember the exact error message).
I don’t know if it was because of my configuration, or because version 4.2.2 doesn’t support the URL wrappers. It drove me crazy. I didn’t want to upgrade because all the packages are old; it would take time and might cause more problems. I couldn’t even find the apxs binary to compile PHP from source.
Finally, I found a workaround: first use exec to call wget to download the URL to a file in /tmp, and then use fopen/fread to read this temp file. It really works.
Another problem was that preg_match_all doesn’t accept the last $offset parameter in PHP 4.2.2, but it’s simple to fix, I think.
This took me some time, but it made me realize how much the development of software and language tools has eased our daily work.
Posted in Development | No Comments »
One month ago, I became interested in Django and made learning Python well a goal for myself.
Yes, I know there are other ways to study a language. For example, learn Python by practicing with Django. But I wanted to be a bit familiar with Python before coding Django websites. So I decided to implement the algorithms in the famous book “Introduction to Algorithms”. The even greater benefit for me, I thought, was that I could get more familiar with algorithms.
It was an ambitious plan for someone like me, without great determination. Some friends said it would be hard when I told them. Now it turns out I really can’t go on with it. At the least it must be paused, if not terminated.
I just got a new job. Although I really love it, I’m overwhelmed by the number of new tools and the amount of knowledge I must learn. The good news is that I will learn Python for this job. The bad news is that I’m afraid I can’t learn Python by implementing the famous algorithms. I must learn fast by practicing in real production work.
So learning Python is easy, but the road I chose toward this goal is hard. Maybe it will pay off greatly: reading the book already helped me a lot in the interview for this great job.
Will I resume the process when time is less precious than it is now? I hope so.
Posted in General | No Comments »
Recently I noticed that one of my Greasemonkey scripts, called “Google Reader Unread Count in Gmail”, was displaying the wrong count. I was very busy, but at least two users urged me to solve the problem, so I took a look at the issue, which turned out to be very simple and easy to fix.
In fact, Google changed the XML schema of the output, so my script was displaying a timestamp. That’s why the count looked so strange.
Now the source code on userscripts.org is updated (here’s the diff). Please update your local version if you’re facing the same problem.
Posted in JavaScript | No Comments »
A very common purpose of Greasemonkey scripts is modifying the DOM structure of the document, mostly adding something new.
I’ve written several scripts before and have been doing such modifications in the “load” event handler, as in this script. The problem is that the “load” event is fired only after all images on the page have been downloaded completely, so users may wait a long time before seeing the changes if there are many images.
jQuery users know that jQuery has a function “ready”. It won’t wait for the images to load. We are writing Greasemonkey scripts, so we only care about Firefox. Firefox has a “DOMContentLoaded” event explained in detail here.
Fired on a Window object when a document’s DOM content is finished loaded, but unlike “load”, does not wait till all images are loaded. Used for example by GreaseMonkey to sneak in to alter pages before they are displayed.
So when writing my last script I tried to rely on this event:
window.addEventListener('DOMContentLoaded', function(e) {
...
}, false);
But the code in the handler was never executed.
I was confused and did some searching. Finally I found this page saying that “the code in a Greasemonkey user script gets invoked when the DOMContentLoaded event fires”. That explains it all. The “DOMContentLoaded” event had already fired by the time my listener was registered, so the code never had a chance to run.
The solution, of course, is to pull the handler code out of the wrapper and run it directly. And if your script doesn’t really rely on the “load” event, don’t put the code into a “load” handler, because there may be an obvious delay.
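If you’d like a script that behaves the same whether it is injected before or after the DOM is ready, here is a minimal sketch of one way to do it. The helper name runWhenDomReady is my own (hypothetical), and it assumes a browser that supports document.readyState, which older Firefox versions may not:

```javascript
// Hypothetical helper (not part of Greasemonkey): run `callback` once the DOM
// of `doc` has been parsed. Assumes `doc.readyState` is supported.
function runWhenDomReady(doc, callback) {
  if (doc.readyState === 'loading') {
    // The document is still being parsed: wait for DOMContentLoaded.
    doc.addEventListener('DOMContentLoaded', function onReady() {
      doc.removeEventListener('DOMContentLoaded', onReady, false);
      callback();
    }, false);
  } else {
    // 'interactive' or 'complete': the DOM is already parsed, so run right away
    // instead of registering a listener that might never be called.
    callback();
  }
}

// In a user script, pass the page's document. The guard only lets the snippet
// load outside a browser as well.
if (typeof document !== 'undefined') {
  runWhenDomReady(document, function () {
    // DOM modifications go here.
  });
}
```

When Greasemonkey invokes the script at DOMContentLoaded time, this takes the second branch and runs the code immediately, which is the same as pulling the code out of the wrapper.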
Posted in JavaScript | No Comments »
Two weeks ago we went to Mount Huang (Huangshan in Chinese) and spent 3 days there (2 days on the mountains). The trip was wonderful and Mount Huang is really impressive.
During the short trip we saw all kinds of beautiful scenery: the sea of clouds, sunrise, sunset, various strangely shaped pines, peculiarly shaped granite peaks, etc.
We walked on the dangerous paths hanging from the middle of the cliffs, which was very exciting. You’d better not look down, and pray that the path is solid enough.

Below are the carved steps to Celestial Peak. They’re like a ladder stretching into the fog. They’re almost vertical, but don’t be too frightened, as there are ropes beside them for travellers to grab.

The sea of clouds appears in the morning. As you can see in the pictures, we were standing higher than the clouds. The picture on the right shows a scenic spot called “monkey watching the sea”.

If you’re interested, view more photos here: http://www.flickr.com/photos/qingbo/sets/72157609353527815/
Posted in Travel | No Comments »
These photos were taken after the closing ceremony of the 2008 Summer Olympics. The closing ceremony was not very good, in contrast to the opening ceremony.
Except for the fireworks. After the closing ceremony, fireworks lit up the sky all over Beijing:


Taking photos from my room is difficult because it’s not in the right position. A tripod is of no use because the camera has to be held out of the window. Besides, my camera (Canon A510) is not good enough.
Posted in Life | No Comments »