
Automating Newspaper and Magazine Downloads!



This morning [October 3, 2018, GMT+5:45] we had an amazing discussion in the dev.to Twitter chat.

There were many surprising automation ideas, and here's the one I posted in reply.


The magazines and newspapers listed below are local media and may not be relevant to your region, so please be advised.


Behind this Idea:
The basic idea behind automating the newspaper download is that I often read the local newspaper and rarely watch the news on TV. I have mostly been an internet guy, so browsing the epaper was all I had. The media house's online portal also lets you download the newspapers and magazines as PDFs [which I am developing now], so I came up with this automation idea. It is a very basic script that I wrote almost a year and a half ago. So, let's get to work.



Scenario:

  • 2 editions of the newspaper, English and Nepali, published daily.
  • 1 tabloid published weekly in Nepali. [every Friday!]
  • 1 magazine published weekly in Nepali. [every Sunday!]

[For those who may not know, Nepali is the national language of Nepal, a country in South Asia known as the land of Buddha and Mount Everest.]

Working Stuff

The first thing to do is read the system date and extract the year, month, day and week day, which can be done in bash like this:

# read the date from the system
y=$(date +%Y)
# stores the year in full format, e.g. 2017
m=$(date +%m)
# stores the zero-padded month, e.g. 04
d=$(date +%d)
# stores the zero-padded day, e.g. 12
w=$(date +%u)
# stores the week day as a number with Monday=1, e.g. 3=Wednesday

These variables are used later to build the download link for wget.
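Since the four values come from four separate date calls, a run that straddles midnight could in principle mix two days. A minimal sketch of reading them in a single call is below; the function name `load_date` is an assumption for illustration and is not part of the original script.

```shell
# Hypothetical helper: fill y, m, d and w from one date call so that
# all four values describe the same instant.
load_date() {
    # Word splitting on the single date output fills $1..$4.
    set -- $(date '+%Y %m %d %u')
    y=$1
    m=$2
    d=$3
    w=$4
}
```

After calling `load_date`, the variables `y`, `m`, `d` and `w` can be used in the same way as the ones set above.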

The script first checks for the main directory and creates it if it is not there, then moves into it. The script uses wget to download the papers, so make sure it is already installed. The script could also check for wget and install it if it is missing, but I am skipping that part for now.
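The directory check and the skipped wget check could look roughly like the sketch below. The function names `enter_dir` and `have_cmd` are assumptions made for illustration; the original script may do this differently.

```shell
# Hypothetical helper: create the folder if it is missing, then move into it.
enter_dir() {
    mkdir -p "$1" && cd "$1"
}

# Hypothetical helper: succeed only if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}
```

Calling `enter_dir "$HOME/epaper"` followed by `have_cmd wget || exit 1` at the top of the script would cover both checks. The folder name here is only an example.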

The next step is to create a separate folder for each newspaper and magazine so that the script downloads each epaper into its own folder. Skipping the rest of the explanation, the main job on each run is to check each folder, create it if it is missing, and download the available epaper into it.

The main download command looks like this:
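A minimal sketch of the per-publication folders follows. The function name `make_publication_dirs` is hypothetical, and the publication names are passed in as arguments because the real names used by the portal are not given here.

```shell
# Hypothetical helper: create one sub-folder per publication.
# $1 is the main folder; every further argument is a publication name.
make_publication_dirs() {
    base=$1
    shift
    for name in "$@"; do
        # mkdir -p does nothing if the folder already exists.
        mkdir -p "$base/$name" || return 1
    done
}
```

For example, `make_publication_dirs "$HOME/epaper" english-daily nepali-daily` would create two folders, where both names are placeholders.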

wget -O <name-of-newspaper-or-magazine>-$y-$m-$d.pdf "http://epaper.ekantipur.com/epaper/<name-of-newspaper-or-magazine>/$y-$m-$d/$y-$m-$d.pdf"

Although the command is self-explanatory, here is a little explanation.

The epaper file for each issue is named after its date, for example [2018-10-03.pdf] for today, so we substitute the variables from the first step into the download link. The week day is extracted to check for the weekly publications: with Monday counted as 1, the tabloid published every Friday falls on 5 and the magazine published every Sunday falls on 7.
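Put together, the link building and the week day check could be sketched as below. The function names `is_due`, `epaper_url` and `fetch_if_due` are assumptions for illustration; only the URL pattern and the wget call come from the command shown earlier.

```shell
# Hypothetical helper: is a publication due today?
# $1 is its schedule ("daily" or a week day number), $2 is today's
# week day from date +%u (1=Monday ... 7=Sunday).
is_due() {
    [ "$1" = "daily" ] || [ "$1" = "$2" ]
}

# Build the download link: $1 name, $2 year, $3 month, $4 day.
epaper_url() {
    printf 'http://epaper.ekantipur.com/epaper/%s/%s-%s-%s/%s-%s-%s.pdf\n' \
        "$1" "$2" "$3" "$4" "$2" "$3" "$4"
}

# Download one publication if it is due.
# $1 name, $2 schedule, $3 year, $4 month, $5 day, $6 week day.
fetch_if_due() {
    is_due "$2" "$6" || return 0
    wget -O "$1-$3-$4-$5.pdf" "$(epaper_url "$1" "$3" "$4" "$5")"
}
```

With this, a Friday tabloid would be fetched with `fetch_if_due <name-of-newspaper-or-magazine> 5 "$y" "$m" "$d" "$w"`, and a daily paper with the schedule set to `daily`.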

The next part is setting up a cron job that tells the script to run every day at 6 or 7 AM. We set up the cron job like this:

crontab -e

Then insert a line like this, where the path is your script's location:

0 6 * * * /home/username/script/newspaper-to-pdf.sh
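If you would rather not edit the crontab by hand, the entry can also be added from a script. The sketch below is one possible approach, not the author's setup: the function names `cron_line` and `install_cron_line` are hypothetical, and the generated line also discards the job's output so that cron has nothing to e-mail.

```shell
# Hypothetical helper: build a crontab line that runs a script daily.
# $1 is the hour (0-23), $2 is the full path to the script.
cron_line() {
    printf '0 %s * * * %s >/dev/null 2>&1\n' "$1" "$2"
}

# Hypothetical helper: append the line to the current user's crontab
# unless an identical line is already there.
install_cron_line() {
    line=$(cron_line "$1" "$2")
    current=$(crontab -l 2>/dev/null)
    case "$current" in
        *"$line"*) return 0 ;;
    esac
    {
        [ -n "$current" ] && printf '%s\n' "$current"
        printf '%s\n' "$line"
    } | crontab -
}
```

Running `install_cron_line 6 /home/username/script/newspaper-to-pdf.sh` would schedule the script for 6 AM. Many cron implementations also honor a `MAILTO=""` line at the top of the crontab as another way to turn off e-mail, but check your system's crontab manual before relying on it.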

You can customize the cron entry further to suit your needs. In my case, I disabled cron's email output. That's all for today!

Here’s the full script.


cdrrazan

Rajan Bhattarai

Software Engineer by work. Full Stack Ruby on Rails Developer. DevOps and Blockchain. Tech Blogger. Inquiries and Articles: inbox@cdrrazan.com S/ #crn
