Show word occurrence in PDFs on the command line

Do you want to know how often each word occurs in a PDF file, sorted with the most frequent word first?

pdftotext mypdf.pdf - | sed "s/[[:cntrl:][:digit:][:punct:]]//g" | tr '[:space:]' '[\n*]' | sort | uniq -c | sort -bnr

Let’s break it down step by step:

pdftotext mypdf.pdf -

extracts the text of the PDF and writes it to standard output (the trailing - means "write to stdout instead of a file")


sed "s/[[:cntrl:][:digit:][:punct:]]//g"

deletes all control characters (cntrl), all digits (digit) and all punctuation characters (punct) by replacing them with an empty string.
These are POSIX character classes; see the documentation of POSIX bracket expressions for the full list.
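
To see what this step does on its own, you can feed it a sample line instead of a PDF. This is only a sketch: the function name strip_chars and the input text are made up for illustration.

```shell
# strip_chars: hypothetical wrapper around the sed step from the pipeline.
strip_chars() {
  sed "s/[[:cntrl:][:digit:][:punct:]]//g"
}

# Made-up sample line: the comma, the exclamation mark, the digits
# and the period are deleted; letters and spaces are kept.
printf 'Hello, world! 42 times.\n' | strip_chars
```

The two spaces around the deleted number stay where they are, so they end up next to each other in the output.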

tr '[:space:]' '[\n*]'

replaces every whitespace character (spaces, tabs, newlines) with a newline, so that each word ends up on its own line
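
One detail of this step is easy to miss: two whitespace characters in a row produce an empty line, which is later counted like a word. The sketch below shows this on made-up sample text and, as a possible variant that is not part of the original command, uses the -s option of tr to squeeze repeated newlines. The helper name sample_text is hypothetical.

```shell
# sample_text: hypothetical helper printing a made-up line
# that contains a double space and a tab.
sample_text() {
  printf 'one two  three\tfour\n'
}

# Original form: each whitespace character becomes its own newline,
# so the double space leaves an empty line behind.
sample_text | tr '[:space:]' '[\n*]'

# Variant with -s: runs of newlines are squeezed into a single one.
sample_text | tr -s '[:space:]' '\n'
```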


sort | uniq -c | sort -bnr

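The last part sorts the lines so that identical words are adjacent, collapses them into unique lines prefixed with their count (uniq -c), and finally sorts the result again:
ignoring leading blanks (-b), comparing numerically (-n), and in reverse order (-r), so the most frequent word comes first.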
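
If you use this often, the steps above can be wrapped in a small shell function that reads text from standard input. This is a minimal sketch, not the original command: the name word_freq is hypothetical, and the lowercasing, the -s option and the removal of empty lines are additions, made so that "The" and "the" are counted together and blank lines are not counted at all.

```shell
# word_freq: hypothetical helper. Reads text on stdin and prints
# "count word" lines, most frequent word first.
word_freq() {
  sed "s/[[:cntrl:][:digit:][:punct:]]//g" |  # drop control chars, digits, punctuation
    tr '[:upper:]' '[:lower:]' |              # addition: fold case
    tr -s '[:space:]' '\n' |                  # one word per line, squeeze repeats
    sed '/^$/d' |                             # addition: drop empty lines
    sort | uniq -c | sort -bnr                # count and sort by frequency
}

# Try it on made-up sample text instead of a PDF and show the top two words.
printf 'The cat saw the dog. The dog ran.\n' | word_freq | head -n 2

# With a real file it would be used like this:
#   pdftotext mypdf.pdf - | word_freq | head -n 20
```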

PHP package installation

First, check which packages are already installed and show only those with "php" in the name. On a fresh system this usually returns nothing.

dpkg --get-selections | grep php

Next, check which PHP 7 packages and modules are available:

sudo apt-cache search --names-only php7

Now install the PHP 7.0 base package:

sudo apt-get install -y php7.0

Then install the Apache module and some additional PHP extensions:

sudo apt-get install -y libapache2-mod-php7.0 php7.0-curl php7.0-xml php7.0-mysql php7.0-mcrypt php7.0-intl php7.0-zip php7.0-bz2
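
After the installation you may want to confirm that PHP and the extensions are actually available. The following is a rough sketch, assuming the command-line binary php was installed along with the packages above. The function name check_php is hypothetical, and php -m lists the modules of the CLI, which can differ from what the Apache module loads.

```shell
# check_php: hypothetical helper that reports the PHP version and
# the loaded modules matching the extensions installed above.
check_php() {
  if command -v php >/dev/null 2>&1; then
    php -v    # print the installed PHP version
    # list loaded modules and keep only the ones of interest
    php -m | grep -i -E 'curl|xml|mysql|mcrypt|intl|zip|bz2' ||
      echo "none of the expected modules are loaded"
  else
    echo "php is not installed or not on the PATH"
  fi
}

check_php
```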