
Linux.com

Feature: System Administration

How to save time and traffic upgrading with apt-proxy

By Nathan Willis on June 26, 2008 (9:00:00 AM)


June is Bandwidth Conservation Month (well, not officially, but let's say that it is), so if you have multiple machines running an APT-powered Linux distribution such as Debian or Ubuntu, you should take a look at apt-proxy, a utility that caches package downloads in a shared pool for all interested parties on your LAN. This saves you both the time and the bandwidth it costs to download the same updates for more than one computer.

I was blissfully unaware of apt-proxy myself last week when I was upgrading three Ubuntu machines from 7.10 to 8.04, but the idea of caching the downloaded packages seemed so obvious that I was sure such a utility existed. After familiarizing myself with it I decided to test it out on the next batch of package updates that came down the wire.

Nothing about apt-proxy is Ubuntu-specific; it works equally well for any APT repository. Nevertheless, if you are trying it for the first time, it would be wise to look for distro-specific instructions or success stories from other users. And if you don't get apt-proxy to work, you can always roll back the changes and update your system the way you used to.

An apt-proxy setup involves running the apt-proxy service itself on one machine that in turn acts as an APT repository for the others. For those others, no new software need be installed -- adding the apt-proxy server to their respective /etc/apt/sources.list is the only configuration required.

In sources.list, the entry for a typical repository takes the form deb http://some.server.org/and/optional/path distribution component1 component2 ... componentN. The distribution element (e.g. gutsy) distinguishes between multiple distros or distro releases on the same server, and the component elements refer to discrete collections of packages (such as main, contrib, or non-free) available for the specified distro. Luckily, those are all server-side APT issues, and apt-proxy does not have to worry about them.
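To make the anatomy of an entry concrete, here is a minimal shell sketch that splits one line into its fields. The sample entry and the variable names are illustrative only.

```shell
# Split one sources.list entry into its fields using the shell's word splitting.
# The entry below is a made-up example in the format described above.
entry='deb http://some.server.org/and/optional/path gutsy main universe'

# read assigns the first three words to the first three variables;
# everything left over (the component list) lands in "components".
read -r kind url distribution components <<EOF
$entry
EOF

echo "type:         $kind"
echo "URL:          $url"
echo "distribution: $distribution"
echo "components:   $components"
```

Only the URL field matters to apt-proxy; the other fields stay on the client side.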

But apt-proxy will have to find all of the requested packages, so you will have to jot down all of the URLs that make up the second element in each entry so that you can add them to apt-proxy's config file. If all of your computers run the same version of the same distro, it will be a short list, but make note of the repositories you need before you begin.
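Rather than copying the URLs by hand, you could collect them with a short helper like the one below. This is a sketch: list_repo_urls is a hypothetical name, and it assumes plain entries in which the URL is the second field.

```shell
# Hypothetical helper: print each distinct repository URL found in the
# active (uncommented) deb and deb-src lines of the given file(s) or stdin.
list_repo_urls() {
    awk '$1 == "deb" || $1 == "deb-src" { print $2 }' "$@" | sort -u
}

# Demo on inline sample lines; on a real client, pass /etc/apt/sources.list.
printf '%s\n' \
    'deb http://archive.ubuntu.com/ubuntu hardy main restricted' \
    'deb-src http://archive.ubuntu.com/ubuntu hardy main restricted' \
    '# deb http://example.invalid/disabled hardy main' \
    'deb http://dl.google.com/linux/deb/ stable non-free' |
    list_repo_urls
```

Running list_repo_urls /etc/apt/sources.list on each client and merging the results gives the full list of backends the proxy will need.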

To get started, install the apt-proxy package on the computer you intend to use as a server with sudo apt-get install apt-proxy. Once it is installed, edit the file /etc/apt-proxy/apt-proxy-v2.conf. The default configuration is likely to work fine for most users, but familiarize yourself with the details at the start of the file just to be on the safe side.

The port setting (default 9999) is important; it is the TCP port on which apt-proxy will listen for connections. If you run a firewall, make sure this port is open and no other services are running on it. Likewise, cache_dir is set to /var/cache/apt-proxy, so you will want to ensure that that directory has sufficient space, particularly if you are upgrading an entire distro.
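One way to check the disk-space point before a large upgrade is a small test like this. It is a sketch: cache_has_space is a hypothetical helper, and any threshold you pass it is your own estimate, not a figure from apt-proxy.

```shell
# Hypothetical helper: succeed if the filesystem holding directory $1 has at
# least $2 kilobytes available. Fails if the directory cannot be inspected.
cache_has_space() {
    dir=$1
    need_kb=$2
    # df -P gives one line per filesystem; field 4 is the available space,
    # reported in kilobytes because of -k.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}
```

For example, cache_has_space /var/cache/apt-proxy 2097152 || echo "less than 2 GB free" warns when the cache directory has under 2 GB available. The 2 GB figure is only a placeholder.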

Configuring your repositories

The latter half of the configuration file lists the APT servers that apt-proxy itself will connect to. This is the portion of the file you will definitely want to edit.

Each APT repository accessed by any of your client machines is recorded here as what the in-line comments call a "backend server." Each backend server gets its own section, beginning with a name enclosed in [brackets]. You get to choose the name, so choosing a short and descriptive one can help if you use a lot of repositories.

You can list multiple alternate URLs in each backend server section to allow failover in case the main repository is unreachable. Don't confuse the use of multiple URLs for each backend server with the fact that you can have multiple backend server sections. The alternate URLs within one bracketed backend server section must all be sources for the same set of packages. Apt-proxy tries the first URL in each section first, and looks at the others only if the first fails.

For example, if you use the main Debian repository and Google's APT repository, you will need a [debian] section and a [google] section. The [debian] section can list both the main URL (i.e., http://ftp.us.debian.org/debian) and an alternate (say, ftp://ftp.uk.debian.org/debian).
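The two sections from this example might look roughly like the sketch below, which writes them to a scratch file for review instead of touching /etc/apt-proxy directly. The backends key name and the indented one-URL-per-line layout are assumptions based on the stock configuration file, so confirm them against the comments in your installed copy.

```shell
# Write example backend sections to a scratch file for review.
# Assumption: the "backends" key and its indented URL list match the
# layout used in the stock apt-proxy-v2.conf.
conf_sketch=$(mktemp)
cat > "$conf_sketch" <<'EOF'
[debian]
; The first URL is tried first; the second is a failover
; that must serve the same set of packages.
backends =
    http://ftp.us.debian.org/debian
    ftp://ftp.uk.debian.org/debian

[google]
backends =
    http://dl.google.com/linux/deb/
EOF

cat "$conf_sketch"
```

The section names debian and google are the ones that later appear in the clients' sources.list URLs.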

Notice also that you only list the URL for each repository, and not the distribution or components that are also included in a sources.list entry. Apt-proxy does not care about them; it will fetch whatever the apt-get clients request on a case-by-case basis.

Finally, start apt-proxy on the server machine with sudo /etc/init.d/apt-proxy start.

Configuring your clients

On every machine that you want to send through apt-proxy, edit /etc/apt/sources.list. For each repository entry, replace the original server's URL with the address of your apt-proxy server (including the port number on which it runs), followed by a slash and the name of the corresponding backend server section you set up earlier. Leave the "deb" keyword, the distribution, and the components unchanged.

For example, assume that your apt-proxy server is running on port 9999 on 192.168.1.101. If the client's sources.list contained the line deb http://dl.google.com/linux/deb/ stable non-free for Google's repository, you would substitute deb http://192.168.1.101:9999/google stable non-free.
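If a client has many entries, the substitution can be scripted. The sketch below uses a hypothetical to_proxy filter that swaps one upstream URL for its proxy equivalent and passes every other line through unchanged. It ignores a trailing slash when comparing URLs, and rewritten lines come out with single spaces between fields.

```shell
# Hypothetical filter: to_proxy UPSTREAM_URL PROXY_URL
# Reads sources.list lines on stdin and rewrites entries that use
# UPSTREAM_URL so they point at PROXY_URL instead.
to_proxy() {
    awk -v up="$1" -v new="$2" '
        BEGIN { sub(/\/+$/, "", up) }          # ignore a trailing slash
        $1 == "deb" || $1 == "deb-src" {
            url = $2
            sub(/\/+$/, "", url)
            if (url == up) $2 = new            # swap only the URL field
        }
        { print }
    '
}

# The Google entry from the example above:
echo 'deb http://dl.google.com/linux/deb/ stable non-free' |
    to_proxy http://dl.google.com/linux/deb/ http://192.168.1.101:9999/google
```

On a real client you would feed it /etc/apt/sources.list, write the result to a new file, review it, and keep a backup of the original before replacing it.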

Over on the apt-proxy server, the apt-proxy-v2.conf file should have a backend server section labeled [google] that contains the original URL, http://dl.google.com/linux/deb/. When the client requests a package, the apt-proxy server receives the request, checks whether it has already cached the package, and if it has not, passes the request through to the upstream APT repository.

When you update the first client machine, apt-proxy will have to fetch its packages from the Internet as usual, so it won't result in any speed-up. But for every subsequent machine making the same update, apt-proxy will serve up the requested packages at LAN speed -- sparing you time and bandwidth usage.
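The first-client versus later-client behavior can be illustrated with a toy model. This is not apt-proxy's actual code: two scratch directories stand in for the upstream repository and the cache, and the package name is made up.

```shell
# Toy model of a cache lookup; not apt-proxy's real implementation.
# "upstream" and "cache" are scratch directories standing in for the
# remote repository and /var/cache/apt-proxy.
upstream=$(mktemp -d)
cache=$(mktemp -d)
echo 'package data' > "$upstream/example_1.0_all.deb"

serve_package() {
    name=$1
    if [ -f "$cache/$name" ]; then
        # Already cached: no upstream traffic needed.
        echo "HIT $name"
    else
        # Cache miss: fetch from upstream once and keep a copy.
        cp "$upstream/$name" "$cache/$name"
        echo "MISS $name"
    fi
}

serve_package example_1.0_all.deb   # first client: fetched from upstream
serve_package example_1.0_all.deb   # later clients: served from the cache
```

The first call reports a miss and populates the cache; the second reports a hit, which is the case where clients get their packages at LAN speed.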

The only question you'll have will be "why didn't I do this sooner?" But you'll spend so little time doing your updates that you'll hardly have a chance to mull it over.


Comments on How to save time and traffic upgrading with apt-proxy

Note: Comments are owned by the poster. We are not responsible for their content.

How to save time and traffic upgrading with apt-proxy

Posted by: Anonymous [ip: 58.107.243.71] on June 26, 2008 11:04 AM
There is a much simpler program available called "apt-cacher". See http://ubuntu-tutorials.com/2007/01/08/save-bandwidth-during-updates-with-apt-cacher-ubuntu-610/ for a good setup tutorial. Once it is set up on a server, all that is needed to activate it on a client is a single line added to the /etc/apt/apt.conf file: Acquire::http::Proxy "http://apt-cache-machine:3142";

Another program worth looking at is apt-zeroconf which can actually download packages from other computers on the network if they are turned on!

#

How to save time and traffic upgrading with apt-proxy

Posted by: Sajith T S on June 26, 2008 11:09 AM
There's approx and apt-cacher also. Nice thing about apt-proxy and apt-cacher is the capability to "import" existing packages to their cache.

#

How to save time and traffic upgrading with apt-proxy

Posted by: Colin Dean on June 26, 2008 01:44 PM
I implemented apt-cacher at my workplace, where we had six Ubuntu machines pulling daily updates separately on a 1.5 Mbps T1 line. Occasionally, we'd install on a VM and need to update, adding several hundred megabytes to the load. This ground the network to a halt for hours. I put apt-cacher on one of the machines and distributed a command line to add the proxy file. We now have a machine which is displaying "tail -f /var/log/apt-cacher/access.log" so we can watch the fun every couple of hours.

#

Good writing

Posted by: Anonymous [ip: 75.145.180.221] on June 26, 2008 02:12 PM
Good article--very well written. Thanks.

#

How to save time and traffic upgrading with apt-proxy

Posted by: Anonymous [ip: 12.169.163.241] on June 26, 2008 08:21 PM
Thank you, nice article about a useful application.

#

Old and crashes

Posted by: Anonymous [ip: 127.0.0.1] on June 26, 2008 10:29 PM
apt-proxy needs some serious work. I'm not the first to restart apt-proxy daily from cron, just to try and keep it going. I'm about to look into approx and apt-cacher as replacements.

#

Re: Old and crashes

Posted by: Anonymous [ip: 71.185.208.232] on June 27, 2008 12:03 AM
Yeah, I strongly agree. apt-proxy is unstable and needs to go away. I used it for several years very happily, until the bit-rot got so bad that even the daily cron restarts didn't work. Here are the details on why I picked approx over apt-cacher and I haven't looked back: http://lists.netisland.net/archives/plug/plug-2008-05/msg00038.html

#

How to save time and traffic upgrading with apt-proxy

Posted by: Anonymous [ip: 66.211.58.210] on June 27, 2008 02:58 PM
I think apt-cacher is probably the best solution out there because it is just seen as a regular HTTP proxy and does everything else behind the scenes.

#

apt-proxy is a great idea

Posted by: Anonymous [ip: 64.104.252.130] on June 28, 2008 01:56 PM
But apt-cacher is a great implementation of that great idea.

Thoughts of using the shell-script-based apt-proxy still give me nightmares.

#




 