How To: Download with Newsgroups

So you've built a nifty file server running Linux after following your favorite blogger's series of DIY 200 Dollar PC articles (Part 1, Part 2, Part 3). Now what? Wouldn't it be nice if you could turn that server into a speedy downloading machine?

Take this scenario as an example: It's 8pm and you just got out of work from Google, Facebook or whatever your favorite "it" company is these days. You're waiting at the Caltrain station playing around with your iPhone until the train arrives to scuttle you back to San Francisco. You realize you forgot about something you wanted to download and send an email to your server from your iPhone. By the time you get home the download has finished and is waiting for you.

That can be done, and this article aims to get you hooked up with such a setup using a Usenet service provider. For the purposes of this article, "newsgroups" and "Usenet" are used pretty much interchangeably.

Disclaimer: By writing this article I do not endorse, encourage or otherwise support piracy, nor can I be held liable in the event someone should follow this guide to pirate. Like BitTorrent, Usenet and newsgroups have legitimate uses.

Newsgroups... Usenet... What is that?

Wikipedia states that Usenet is a global, decentralized, distributed Internet discussion system. Within Usenet, there are newsgroups for different types of threaded discussion, typically accessed through newsreader software. However, discussions aren't the only things taking place on Usenet newsgroups. The alt.* hierarchy of newsgroups, for example, is home to binaries newsgroups where people may upload and share various types of files and media.

Unlike BitTorrent, a popular peer-to-peer communications protocol, interacting with newsgroups only requires downloading from a server rather than downloading from peers and uploading to peers. As such, downloading through newsgroups is typically extremely fast, given a proper Usenet service provider. Also, you are only connecting to your Usenet service provider (not to any peers or other computers), so far fewer machines ever see your connection than with BitTorrent.

Just to give you an idea of the speed, if you had a large queue of files to download over a typical 6 megabit cable Internet connection, you would max out the line and transfer roughly 2 terabytes of data in a single month. If you lived on a college campus with a fat pipe (assuming fat = 60+ Mbps), that number would be closer to 20 terabytes. Of course, I doubt many people would realistically download 24/7 for an entire month.
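If you want to check those numbers, here is a quick back-of-the-envelope sketch. It assumes the line is saturated around the clock, a 30-day month and decimal units (1 TB = 1,000,000 MB); monthly_transfer_tb is just a name I made up for the illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def monthly_transfer_tb(link_mbps, days=30):
    """Terabytes moved if a link of link_mbps megabits/s is saturated for `days` days."""
    megabytes_per_second = link_mbps / 8  # 8 bits per byte
    total_megabytes = megabytes_per_second * SECONDS_PER_DAY * days
    return total_megabytes / 1_000_000  # decimal units: 1 TB = 1,000,000 MB


print(round(monthly_transfer_tb(6), 2))   # 6 Mbps cable line  -> 1.94
print(round(monthly_transfer_tb(60), 2))  # 60 Mbps campus line -> 19.44
```

A 31-day month nudges the 6 Mbps figure just past 2 TB, and real-world protocol overhead will shave a little off either number.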

For example, try this newsgroup download speed test hosted by Giganews to see what kind of speeds you can get. Note that these would be the same as actual download speeds.

Newsgroup Download Speed Test by Giganews
(my Internet connection is fiber-based)

Choosing a Usenet service provider.

To use newsgroups, you will need a Usenet service provider, which gives you access to a news server that speaks NNTP (Network News Transfer Protocol). There are many options to choose from. A few are free, but most are paid and come with strict download limits unless you opt for an unlimited package. After a quick search online, I found that Giganews Usenet is one of the most popular providers, and with good reason: Giganews hosts over 100k newsgroups (almost all of them), is a fast and reliable Tier 1 Usenet provider and has lengthy binary retention.

Binary retention is probably the most important thing to take note of when looking for a Usenet service provider; it is the length of time files stay on the provider's servers after they are uploaded. The shorter the retention, the lower your chances of finding the files you're looking for. After checking out a handful of Usenet providers, I landed on Giganews, which offers an industry-leading 500+ days of binary retention!

I went with Giganews's Diamond account as I wanted an encrypted 256-bit SSL connection in addition to an unlimited account. Update 12-26-2009: Giganews now bundles a VPN service called VyprVPN with its Diamond accounts that lets you encrypt all of your computer's web traffic.

You'll need a client.

In order to download with newsgroups, you'll need to set up a client on your computer or file server. Again, there are a lot of options to choose from. Here are some of the best, by OS.

Mac OS X - Panic's Unison ($25). It's hard to compete with this app on OS X. It won an Apple Design Award and comes from the developers of Transmit and Coda. You can also use my Linux favorite, hellanzb, on OS X, but installation is more involved and goes through DarwinPorts. Update: Zach Inglis pointed me towards NZB Drop.

Windows - GrabIt (free), NewsLeecher ($20/yr), Android ($27)

Linux - I have tried many GUI-based newsreaders in Linux and they all fell short of my expectations. If you have any suggestions, please let me know in the comments, but I am very happy with the Python, command-line-based hellanzb. There is also URD (Usenet Resource Downloader) for those that need a GUI.

I keep hearing about *.nzb files. What are they?

An nzb file is an XML-based text file containing metadata and links to Usenet postings. The format was created by the popular Usenet index Newzbin.com. Usenet downloading clients use the nzb file to find out what needs to be downloaded from Usenet servers.

The easiest way to think about nzb files is with this analogy: .torrent files are to BitTorrent as .nzb files are to Usenet. They contain none of the actual content, only information about where to find the pieces that make up the files. When going about downloading something with Usenet newsgroups, you typically want to look for and download the nzb file and then load that into your download client.
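To make that a bit more concrete, here is a small sketch that reads an nzb document with Python's standard library. The sample XML is entirely made up (poster, subject, newsgroup and message IDs are placeholders), though it follows the usual nzb layout of file, groups and segments elements; summarize_nzb is a hypothetical helper, not something hellanzb ships.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the nzb layout: one <file> per posted file, each listing
# the newsgroups it was posted to and the message IDs of its segments.
SAMPLE_NZB = """<nzb xmlns="http://www.newzbin.com/DTD/2003/nzb">
  <file poster="someone@example.com" date="1262304000" subject="example.iso (1/2)">
    <groups>
      <group>alt.binaries.example</group>
    </groups>
    <segments>
      <segment bytes="500000" number="1">part1of2.abc@example.com</segment>
      <segment bytes="250000" number="2">part2of2.abc@example.com</segment>
    </segments>
  </file>
</nzb>"""

NS = {"nzb": "http://www.newzbin.com/DTD/2003/nzb"}


def summarize_nzb(xml_text):
    """Return one summary dict per <file> element in an nzb document."""
    root = ET.fromstring(xml_text)
    summaries = []
    for file_el in root.findall("nzb:file", NS):
        segments = file_el.findall("nzb:segments/nzb:segment", NS)
        summaries.append({
            "subject": file_el.get("subject"),
            "groups": [g.text for g in file_el.findall("nzb:groups/nzb:group", NS)],
            "message_ids": [s.text for s in segments],
            "total_bytes": sum(int(s.get("bytes")) for s in segments),
        })
    return summaries


for entry in summarize_nzb(SAMPLE_NZB):
    print(entry["subject"], entry["groups"], entry["total_bytes"])
```

Notice there is no file data anywhere in the document: a client takes the message IDs, asks the news server for each segment and stitches the pieces back together.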

Tweaking Hellanzb

Since I imagine many of you will want to take advantage of newsgroup downloading on your new Linux file server, I will discuss my favorite Linux newsgroup client - hellanzb. It might be command-line based, but it is extremely capable and can be run as a handy daemon. Hellanzb has a strong community behind it that has created numerous related applications, including a web UI and Firefox integration for smoother nzb queuing.
Hellanzb Status
How can you say no to ASCII art?

SSH into your Linux box or work directly on it and install hellanzb and its required parts.

sudo apt-get install par2 rar hellanzb

Hellanzb needs python but you most likely already have that installed if you're using a distro like Ubuntu. Next up, you'll need to configure hellanzb.

sudo vi /etc/hellanzb.conf

In the config file, you will need to go through and change the following items:

  • host - For example, it would be news.giganews.com:119 for non-SSL, news.giganews.com:563 for SSL
  • username & password for your Usenet service provider
  • connections - The maximum simultaneous connections to your Usenet service provider's servers. This depends on your account and can range from something like 5 to 20 max simultaneous connections. If you're not sure, check with your provider.
  • ssl - Set to "True" if your account supports SSL encryption, leave "False" otherwise.
  • Hellanzb.PREFIX_DIR - Where you want the hellanzb directories to live. I wanted them on my larger, secondary drive and I removed the dot from the hellanzb folder name so they would be visible when browsing from networked computers (OS X likes to hide .folders by default).
  • Hellanzb.UNRAR_CMD = '/usr/bin/rar'
  • Hellanzb.PAR2_CMD = '/usr/bin/par2'
  • Hellanzb.SKIP_UNRAR - Set to False to have it automatically extract completed downloads.
  • Hellanzb.NEWZBIN_USERNAME & Hellanzb.NEWZBIN_PASSWORD - Provide login info for your Newzbin.com account so hellanzb can automatically download NZB files given the NZBID.
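
The easiest thing to get wrong in that list is pairing the port with the ssl flag. Here is a tiny sanity-check sketch for the values you are about to type in. The dict keys simply mirror the bullet names above and are not the literal hellanzb.conf syntax, the credentials are placeholders, and check_server_settings is a hypothetical helper of my own.

```python
def check_server_settings(settings):
    """Return a list of problems found in a dict of provider settings."""
    problems = []
    host, _, port = settings["host"].rpartition(":")
    if not host or not port.isdigit():
        problems.append("host should look like 'server:port'")
    elif settings["ssl"] and port == "119":
        problems.append("ssl is True but port 119 is the non-SSL port")
    elif not settings["ssl"] and port == "563":
        problems.append("ssl is False but port 563 is the SSL port")
    if settings["connections"] < 1:
        problems.append("connections must be at least 1")
    return problems


# Placeholder values; substitute the details from your own provider account.
my_settings = {
    "host": "news.giganews.com:563",
    "username": "changeme",
    "password": "changeme",
    "connections": 10,
    "ssl": True,
}
print(check_server_settings(my_settings))  # an empty list means nothing obvious is wrong
```

It only catches the obvious mismatches; the connection limit itself still depends on your account, so check with your provider.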

Don't forget to save and quit vim (:wq). Once hellanzb is installed and configured, you can start it up by simply running it.

hellanzb

Starting hellanzb

That is the standard way of running hellanzb. It gives you live download stats and monitors the queue directory (.hellanzb/nzb/daemon.queue) for new nzb files to download. Just drop an nzb file into the queue directory to start a download. However, you must keep the terminal open the whole time or the process will quit. This is why I prefer running hellanzb as a daemon process:

hellanzb -D

A daemon process runs in the background, so it keeps working after you close the terminal. It's always at your beck and call. Interacting with hellanzb as a daemon is a bit different. To get download stats, you must request them manually with the status option.

hellanzb status

You can cancel downloads with the cancel option and so on. Read the man page for more.

Remote administration

The whole point of this article is not only to get started with downloading over newsgroups but to get it to a point where you can start downloads away from the computer. There are multiple methods of doing this but naturally I wanted to go the road less taken and build something.

Alternatives include using a web UI such as hellahella, but to access the UI from outside my network I would have to open up ports on my router and potentially put my network's security at risk. Another option is opening up an SSH port on the router and then SSHing into the server to manage downloads. It's a powerful option, but it brings with it the hassle of setting up DynDNS for a dynamic IP and forwarding the SSH port to the file server, whose local IP address can itself change when devices are plugged into or unplugged from the network and DHCP reassigns recently used IPs. In a nutshell, no thanks to all of that.

Custom email to hellanzb python/bash scripts you say? Respect.

I realized the best solution for me would be emailing my file server somehow. Initially, I thought I would send it the nzb file, but that's too much hassle for the user, brings in device constraints (try downloading and emailing an nzb file on your iPhone) and leads to a more complicated script. I decided it would be best to create a special email account and send it emails whose body contains the NZB ID (typically a 7-digit number), or the URL itself, of the file I wanted to download.

After many hours wasted trying to work with various mail fetching scripts and not getting what I wanted, I enlisted the help of hacker extraordinaire Mike Wozniak. Mike is a computer engineering major at Georgia Tech who first got me into Linux when he lived down the hall my freshman year, but I digress. He came up with two scripts.

Download the scripts here: hellanzb_scripts.zip (4kB)

Place both scripts in /usr/bin/. Edit hellagmail.py to include the login credentials of the email account you wish to tie to your file server/download box. Look for the info around line 170. I highly recommend setting up an entirely new account only for this purpose. The file is named after Gmail as it makes use of POP3 over SSL and the Gmail POP server. I believe it can be used with any email account by removing the _SSL part around lines 111-112 in the try block and changing the POP server URL.

When choosing an address for this account, pick one that is hard to guess and never share it, or others might send it NZB IDs to download without your knowledge. As such, mine is more of a captcha than a real word. Also, POP3 access must be enabled in the Gmail account.
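
For the curious, the SSL versus non-SSL switch mentioned above comes down to which class from Python's standard poplib module you use. The sketch below is my own illustration of that idea, not the code inside hellagmail.py; pop3_class and fetch_raw_messages are hypothetical names, and the host and credentials you pass in are up to you.

```python
import poplib

GMAIL_POP_HOST = "pop.gmail.com"  # Gmail's POP server, which expects SSL


def pop3_class(use_ssl):
    """Pick the poplib class: POP3_SSL for Gmail-style servers, POP3 for plain ones."""
    return poplib.POP3_SSL if use_ssl else poplib.POP3


def fetch_raw_messages(host, username, password, use_ssl=True):
    """Log in and return every message in the mailbox as raw bytes."""
    conn = pop3_class(use_ssl)(host)  # default ports: 995 for SSL, 110 for plain
    try:
        conn.user(username)
        conn.pass_(password)
        message_count, _mailbox_size = conn.stat()
        messages = []
        for index in range(1, message_count + 1):  # POP3 numbers messages from 1
            _response, lines, _octets = conn.retr(index)
            messages.append(b"\r\n".join(lines))
        return messages
    finally:
        conn.quit()
```

Calling fetch_raw_messages(GMAIL_POP_HOST, "you@example.com", "secret") would talk to Gmail over SSL, while passing use_ssl=False and another host targets a plain POP3 server.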

Hellanzb email script

Save hellagmail.py when you're done and make it executable with the line below.

sudo chmod +x /usr/bin/hellagmail.py

Do the same with hellascript.sh.

sudo chmod +x /usr/bin/hellascript.sh

What do the scripts do?

The hellagmail.py file checks a Gmail account via POP3, reads email messages and outputs the numbers it finds in the email body. As such, send otherwise empty emails - a number in your signature could confuse hellanzb. An example email body might be "1234567 1235678" and those two numbers would be queued by hellanzb.
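
Here is a rough sketch of that "pull the numbers out of the body" step, which also shows why a signature is a problem. It is my own approximation using a regular expression and may not match how hellagmail.py does it; extract_nzb_ids is a made-up name.

```python
import re

# Any standalone run of digits counts as a candidate NZB ID.
NUMBER_PATTERN = re.compile(r"\b\d+\b")


def extract_nzb_ids(body):
    """Return every standalone number found in an email body, in order."""
    return NUMBER_PATTERN.findall(body)


print(extract_nzb_ids("1234567 1235678"))
# ['1234567', '1235678']

print(extract_nzb_ids("1234567\n-- \nCall me at 555 0100"))
# ['1234567', '555', '0100']  <- the signature's numbers sneak in too
```

The second call is exactly the situation to avoid: the phone number in the signature would be treated as two more IDs.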

The bash script, hellascript.sh, checks to see if the hellanzb daemon is running and then queues the NZB IDs from the email using the enqueuenewzbin option. This will only work if you gave hellanzb access to your Newzbin.com account in the config file.
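
The queuing half can be sketched in a few lines as well. This is a Python approximation of the idea rather than the contents of hellascript.sh: it only builds and runs the enqueuenewzbin commands, and it leaves out the "is the daemon running?" check since I am not showing how the real script does that. Both function names are hypothetical.

```python
import subprocess


def enqueue_commands(nzb_ids):
    """Build one `hellanzb enqueuenewzbin <id>` command per NZB ID."""
    return [["hellanzb", "enqueuenewzbin", nzb_id] for nzb_id in nzb_ids]


def enqueue_all(nzb_ids):
    """Run each command; needs hellanzb installed and its daemon already running."""
    for command in enqueue_commands(nzb_ids):
        subprocess.run(command, check=True)  # raises if hellanzb reports an error


print(enqueue_commands(["1234567", "1235678"]))
```

Passing each command as a list rather than a single string keeps whatever arrived by email from being interpreted by the shell.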

Cron job'n it!

These scripts need to be run every so often to check your email account so I used crontab to manage hellascript.sh as a cron job set to be run every 5 minutes.

Edit crontab with this line.

crontab -e

Then give crontab the time information it needs as well as the path of the script to run.

*/5 * * * * /usr/bin/hellascript.sh

Hellascript Cron Job in Crontab

If it looks like that, you can go ahead and save. Fire up the hellanzb daemon (hellanzb -D), browse Newzbin.com, find a download you like and send the NZB ID in an email to your new email account. Within 5 minutes your computer should begin downloading the files after looking up the NZB ID on Newzbin.com and downloading the nzb file. When the download is done, hellanzb will check the parity of the files with par2, repairing as necessary, and then unrar the files (as per the settings in /etc/hellanzb.conf).
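
If the */5 in the crontab line looks cryptic, it is the minute field, and it means "every minute divisible by 5". The toy function below expands a minute field to show which minutes of each hour the script fires on. It only handles the three simple forms named in its docstring and is nowhere near a full cron parser; expand_minute_field is a name I made up.

```python
def expand_minute_field(field):
    """Expand a cron minute field of the form '*', '*/N' or 'N' into minutes."""
    if field == "*":
        return list(range(60))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(0, 60, step))
    return [int(field)]


print(expand_minute_field("*/5"))
# [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
```

The other four fields in the line (hour, day of month, month, day of week) are all *, so the script runs at those twelve minutes of every hour, every day.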

Thoughts

If you download shared files online, do you use Usenet, BitTorrent or something else? Will you consider migrating over to Usenet? What do you think of the whole "emailing your file server from your phone to download you something" concept? If you found this article useful, please leave a comment on the way out and tell your friends.