Annotating PDF with Okular

Every once in a while I check whether there is finally a way to properly annotate PDFs on Linux. Until a couple of months ago the answer was no. Now there is one, but I think it is still little known.

Even this post, whose comments made me take a closer look again, did not mention the option of embedding annotations into the PDF. The comments, however, point to Okular, which has been a very good reader for quite some time, and to a more or less recent version of poppler, the PDF library.

The way to go is to make the annotations with Okular (use the review tool, F6) and then save the PDF with “Save As”. The annotations are then embedded in the PDF file. I tested them with the Adobe reader on Android, where I can both view and alter them.

Unfortunately this information is hidden in the Okular handbook and not really updated on the Okular page. Anyway, I hope the Okular team is thinking of making the process a bit less tiresome by providing an option to save the annotated PDF straight away.

I tested it with Okular 0.17 and poppler 0.22, both as packaged in Fedora 19.

[edit] Just now, as I wrote this article, Okular for the first time produced a popup at my first annotation of the day, explaining what I wrote here. It might appear a bit randomly, though.

Filtering STDERR

I got fed up today with a GStreamer warning that clutters my output so much that I can no longer see my own debugging printouts.

GStreamer-WARNING **: gstpad.c:3923:gst_pad_push_data: Got data flow before segment event

I tried to set GST_DEBUG=0 but it did not help. What did help though, was a simple intervention on the bash side of things:

python cameraPipelineSwapTest.py 3>&1 1>&2 2>&3 3>&- | grep DEBUG

The redirections swap STDOUT and STDERR, so grep receives STDERR and keeps only the lines containing DEBUG, which I print with the Python logging module. Everything else on STDERR is thrown away, so all other warnings are suppressed. (Of course I won't see any GStreamer errors either, but in this case I do not really care.)
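The same filtering can also be done from a small Python wrapper instead of the shell. This is only a sketch: the function name `run_and_filter_stderr` is made up for illustration, and it assumes the child process writes its log lines to STDERR, as the Python logging module does by default.

```python
import subprocess
import sys


def run_and_filter_stderr(cmd, keyword="DEBUG"):
    """Run cmd, print only the stderr lines that contain keyword.

    stdout of the child is left untouched; the kept lines are also returned.
    (Hypothetical helper, not part of any library.)
    """
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    kept = []
    for line in proc.stderr:  # read stderr line by line until the child exits
        if keyword in line:
            kept.append(line.rstrip("\n"))
            print(line, end="")
    proc.wait()
    return kept


# Usage, mirroring the shell command above:
# run_and_filter_stderr([sys.executable, "cameraPipelineSwapTest.py"])
```

Unlike the shell version, this leaves STDOUT where it was instead of moving it to STDERR.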

A Practical Guide towards Performance in Research Code (in Python)

Yesterday I gave a short tutorial about writing Python code for scientific research. In it I compared different methods of speeding up a Python function and ran a small benchmark at the end. I also covered IPython's parallel modules and the use of QThread to thread a scientific program.

http://nbviewer.ipython.org/5774959

Have a look; I am looking forward to your comments.

How to make recorded/saved webcam H264 stream playable in non-gstreamer players

Just a small update to my earlier post about capturing H264 from a webcam. The files I get that way are not necessarily playable in players that do not rely on the GStreamer framework (GStreamer-based applications can play them). If you want to use the videos elsewhere, you need to add some missing metadata (I think what is missing is the total number of frames). For me

 ffmpeg -i damaged.mp4 -acodec copy -vcodec copy fixed.mp4

does the trick. Now I can even import the files into PowerPoint…

The process is very fast, so I just run it as a batch over my recorded files…

(thx to http://ffmpeg-users.933282.n4.nabble.com/repair-mp4-file-td1009312.html)
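The batch run could look roughly like the following sketch. The function names and the directory arguments are assumptions for illustration; only the ffmpeg arguments are taken from the command above.

```python
from pathlib import Path
import subprocess


def remux_command(src, dst):
    """Build the ffmpeg call from above: copy both streams into a fresh container."""
    return ["ffmpeg", "-i", str(src), "-acodec", "copy", "-vcodec", "copy", str(dst)]


def remux_all(in_dir, out_dir, dry_run=False):
    """Remux every .mp4 in in_dir into out_dir (hypothetical helper).

    With dry_run=True the commands are only built and returned, not executed.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for src in sorted(Path(in_dir).glob("*.mp4")):
        cmd = remux_command(src, out_dir / src.name)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if ffmpeg fails
    return commands


# Usage (directory names are placeholders):
# remux_all("recordings", "recordings_fixed")
```

Writing the fixed files into a separate directory keeps the originals untouched in case a remux fails.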

IPython in virtualenv

I struggled hard today to get IPython working in a virtualenv. Or rather, I struggled hard to get the IPython qtconsole working in there. This is just a summary of how I installed the most difficult packages (everything else can be installed via pip or, as written on the IPython page, via: easy_install ipython[zmq,qtconsole,notebook,test])

First I followed this recipe from Stack Overflow
http://stackoverflow.com/questions/1961997/is-it-possible-to-add-pyqt4-pyside-packages-on-a-virtualenv-sandbox

$ workon myProject
$ pip install SIP
$ pip install PyQt
$ cd ~/.virtualenvs/myProject/build/SIP
$ python configure.py
$ make
$ make install
$ cd ~/.virtualenvs/myProject/build/PyQt
$ python configure.py
$ make
$ make install
$ cd && rm -rf ~/.virtualenvs/myProject/build # Optional.

With the alteration of

python configure.py --qmake /usr/bin/qmake-qt4

(http://stackoverflow.com/questions/6906856/error-installing-pyqt)

Then I still could not start the IPython qtconsole. After a bit of frustration I decided to give the head of the git repository a try. That worked in the end:
pip install https://github.com/ipython/ipython/archive/master.zip

Now I just have to install the rest…
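To see what “the rest” is, a quick check from inside the activated virtualenv can list which modules the interpreter cannot find. This is a sketch only: the module names in `REQUIRED` are my guess at what the qtconsole setup above needs, and `missing_modules` is a made-up helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed module names for the setup described above; adjust as needed.
REQUIRED = ["sip", "PyQt4", "zmq", "IPython"]
print("missing:", missing_modules(REQUIRED))
```

Running it with the virtualenv's python (after `workon myProject`) shows what that environment sees, not the system-wide packages.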

P.S.: If someone knows how to configure the IPython qtconsole/notebook kernel in a similar way as described here for normal IPython, let me know. That script did not change the behaviour of the IPython qtconsole/notebook, even though it is executed…

Working with simple RAID1 — fast setup/ use of single disks of former array

Creating simple RAID1

https://wiki.archlinux.org/index.php/RAID

Search for disk
# fdisk -l

Set the partition type to Linux RAID autodetect (FD)
# cfdisk /dev/path_to_disk

Create RAID array
# mdadm --create --verbose /dev/md/your_array --level=1 --metadata=1.2 --name="description" --raid-devices=2 /dev/path_to_array_disk-1 /dev/path_to_array_disk-2

Write the array configuration to mdadm.conf
# mdadm --examine --scan > /etc/mdadm.conf

Assemble RAID array
# mdadm --assemble --scan

Format array drive
# cfdisk /dev/md/your_array
# mkfs.ext3 -b 4096 /dev/md/your_array

Recover data from single drive of RAID1 array

http://blog.sleeplessbeastie.eu/2012/05/08/how-to-mount-software-raid1-member-using-mdadm/

I first needed to stop the RAID array, even though I had physically installed the drive in another computer.
Check the array location, then stop the array
# cat /proc/mdstat
# mdadm --stop /dev/mdXXX

Then I followed the guide linked above:
mdadm -A -R /dev/md0 /dev/sdb1
Which is the same as
mdadm --assemble --run /dev/md0 /dev/sdb1

Then the drive is mountable.
mount --read-only /dev/md0 /mnt/raid/
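To find the array name to pass to mdadm --stop, the output of /proc/mdstat can be parsed with a few lines of Python. This is a rough sketch: `parse_mdstat` is a made-up helper, and it only understands the common `mdX : state ... member[n]` line layout.

```python
import re

# Matches lines such as "md127 : inactive sdb1[1](S)"
_ARRAY_LINE = re.compile(r"^(md\S*)\s*:\s*(\w+)\s*(.*)$")
# Matches member devices such as "sdb1[1]"
_MEMBER = re.compile(r"([A-Za-z0-9_\-]+)\[\d+\]")


def parse_mdstat(text):
    """Map each md array name to its state and member devices."""
    arrays = {}
    for line in text.splitlines():
        match = _ARRAY_LINE.match(line)
        if match:
            name, state, rest = match.groups()
            arrays[name] = {"state": state, "members": _MEMBER.findall(rest)}
    return arrays


# Usage on a real system:
# with open("/proc/mdstat") as f:
#     print(parse_mdstat(f.read()))
```

An array that shows up as inactive with the single recovered disk as its member is the one to stop before assembling it again with --run.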


How to get complete list of ogg files of librivox books on archive.org

If you are going to do some mindless work in a wifi-free environment, you need to be prepared, for instance with an audiobook from LibriVox. Rather than clicking through each chapter in the browser, wget gets it done a bit more easily:

wget -r -l1 -H -t1 -nd -N -np -A.ogg -erobots=off https://ia600403.us.archive.org/20/items/aliceinwonderland_1102_librivox/
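What the -A.ogg filter does can be sketched in Python: collect the links of the directory listing and keep only those ending in .ogg. The class and function names are made up for illustration, and the sketch assumes the listing is plain HTML with ordinary `<a href>` links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def ogg_links(html, base_url):
    """Return absolute URLs of all links in html that end in .ogg."""
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, h) for h in collector.hrefs if h.lower().endswith(".ogg")]


# Usage: fetch the listing page (e.g. with urllib.request.urlopen),
# decode it and pass the text plus the page URL to ogg_links().
```

The resulting list can then be fed to any downloader, or saved to a file for wget -i.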

Published code to “Gstreamer: Stream H264 webcam data to series of files”

I just put the (memory-leaking) code from my previous post about saving an H264 stream into a series of files on GitHub. Have a look, try it out and find the bug 😉

https://github.com/groakat/chunkyH264

Read the readme about the bugs and usage. Let me know if you find more errors.

Combine repeated PDF into a single PDF

Today I needed to print several copies of a PDF via CUPS (Common UNIX Printing System). The problem is that if I use the command line option -# (print a number of copies), the printer tends to print all copies of the first page first, then all copies of the second page, and so forth. To avoid having to sort the pages after printing, I first combine the copies into a single PDF using LaTeX:

\documentclass[10pt,a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\usepackage[left=0cm,right=0cm,top=0cm,bottom=0cm]{geometry}
\usepackage{pdfpages}
\usepackage{forloop}
\begin{document}
\newcounter{themenumber}
\forloop{themenumber}{0}{\value{themenumber} < 60}{
	\centering
	\includepdf[pages=-]{myDocument.pdf} % pages=- includes every page; the default is page 1 only
}
\end{document}

Note: \forloop{themenumber}{0}{\value{themenumber} < 60}{ might not be rendered properly above. Click on view source to get it right.

This approach has the disadvantage that much more data gets sent to the printer, but at least it works.
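If the number of copies changes often, the wrapper file can be generated instead of edited by hand. The following is a sketch under my own assumptions: `wrapper_source` and the trimmed-down template are illustrative, and `pages=-` is added so that pdfpages includes every page of the document rather than only the first.

```python
# Trimmed-down version of the LaTeX wrapper above, with two placeholders.
TEMPLATE = r"""\documentclass[10pt,a4paper]{article}
\usepackage{pdfpages}
\usepackage{forloop}
\begin{document}
\newcounter{themenumber}
\forloop{themenumber}{0}{\value{themenumber} < %(copies)d}{
	\includepdf[pages=-]{%(pdf)s}
}
\end{document}
"""


def wrapper_source(pdf_name, copies):
    """Return LaTeX source that includes pdf_name `copies` times in a row."""
    return TEMPLATE % {"pdf": pdf_name, "copies": copies}


# Usage: write the source to a file and compile it with pdflatex.
# with open("copies.tex", "w") as f:
#     f.write(wrapper_source("myDocument.pdf", 60))
```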