Pacman transparent HTTP cache

Hey all,
I'm a fairly new Arch user, and to be honest, a fairly new Linux user in general. Over the years I tried so many times to make the transition from Windows to Linux and after finding Arch, I felt it was finally time, but I digress...
I now run 2 Arch machines here and one Arch virtual machine, but I found myself with a little problem. Having 3 machines meant either using 3 times the bandwidth to keep them updated or a lot of hassle copying packages around the network. I even experimented with putting the Pacman cache directory on an NFS share, but none of these approaches were acceptable to me.
So this afternoon I sat down and coded a solution I was happy with.
I run a small Linux server here, Debian based, which serves as a small file server to the local network and also hosts a few services for myself and a friend. This server also runs Lighttpd which I use for developing with PHP and Perl.
The idea was that this machine would be a mirror for Arch to update from, but I didn't want to mirror everything, just those packages which I used. After much searching through Google I discovered someone who had done something similar for Debian based distributions, apt-cache. It's essentially a small web-server which, when queried for a package, first checks its local cache, and if it doesn't find the file, it downloads it from an official mirror, both storing it locally and sending it to the client.
I've never coded in Java personally and I didn't want to have 2 web-servers running when one would suffice, so I set about coding something similar in PHP.
The end result is 130 lines of code and a url.rewrite rule, which achieves exactly what I was after. It works like this (a simplified sketch of the core logic follows the steps below):
1) Pacman requests a file from the local server
2) Local server checks to see if it has the file
3a) Local server cannot find the file so it requests it from an Arch mirror
4a) The file is simultaneously downloaded, written to disk and sent to Pacman.
3b) Local server has the file
4b) The file is sent to Pacman
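To make the flow concrete, the core of the script boils down to something like the following (a heavily simplified sketch rather than the real code; the cache directory, mirror URL and query parameter names are placeholders I've made up for illustration, and as mentioned there's no error checking or path sanitising here):

<?php
// Simplified cache-or-fetch logic (sketch only, not the actual script).
$cacheRoot = '/srv/pacman-cache';          // local package store (example path)
$mirror    = 'http://mirror.example.org';  // upstream Arch mirror (placeholder)

$path  = $_GET['path'];                    // e.g. "extra/os/i686/foo-1.0-1.pkg.tar.gz"
$local = $cacheRoot . '/' . $path;

header('Content-Type: application/octet-stream');

if (is_file($local)) {
    // Cache hit: send the stored copy straight to Pacman at LAN speed.
    readfile($local);
} else {
    // Cache miss: stream from the mirror, writing to disk and to Pacman at the same time.
    @mkdir(dirname($local), 0755, true);
    $in  = fopen($mirror . '/' . $path, 'rb');
    $out = fopen($local, 'wb');
    while (!feof($in)) {
        $chunk = fread($in, 8192);
        fwrite($out, $chunk);              // store locally for next time
        echo $chunk;                       // forward to the requesting machine
    }
    fclose($in);
    fclose($out);
}
?>

(It relies on PHP's allow_url_fopen being enabled, and the real thing obviously has to sanitise the requested path before touching the filesystem.)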
The end result is a transparent cache which will only have to download the file once. An example of the speed increase is as follows:
# pacman -S --downloadonly kernel26
kernel26 21.7M 806.4K/s 00:00:28
# rm /var/cache/pacman/pkg/kernel26-2.6.21.5-1.pkg.tar.gz
# pacman -S --downloadonly kernel26
kernel26 21.7M 8.5M/s 00:00:03
As you can see, once the package was cached on the server it didn't need re-downloading, so it transferred to the local machine at LAN speed.
My mirror entries for the repositories look like this:
Server = http://192.168.0.1/pacman-cache/pkg/current/os/i686/
Server = http://192.168.0.1/pacman-cache/pkg/extra/os/i686/
etc....
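On the lighttpd side, the only other piece is the rewrite rule that maps those URLs onto the script; it's a single line along these lines (the script name and parameter are placeholders, not necessarily what I settled on):

url.rewrite-once = ( "^/pacman-cache/(.+)$" => "/pacman-cache.php?path=$1" )

mod_rewrite needs to be loaded for that to work, of course.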
So, my question is this: would anyone out there be interested in the code? Right now it still needs a lot of work before it could be made public, as there's very little error checking; I need to handle unexpected conditions like a broken download, and I also have to add handling for the db.tar.gz files being updated. But as and when I feel it's ready, would anyone use it?
I'd appreciate any input anyone felt like sharing, even feature requests =^.^=
PS: I hope this is in the right sub-forum... I didn't think it belonged in the actual Pacman forum, but if it did, apologies!

Just a little update
I solved the timeout problem. It wasn't a misconfiguration but rather a bug in the code that was causing it to stall randomly when retrieving a remote package; it now happily retrieves multiple packages without a problem.
The following new features have been added since I last posted:
Logging support
It's a little primitive, but it works. The location of the log file is customizable, so it should be possible for logrotate to handle it (see the sample snippet after this feature list); hence I've not added any rotation system of my own.
Setup support
The cache script now functions correctly when initially installing Arch via the /arch/setup script. I've run it through once or twice, but it could do with further testing.
Testing & Unstable repository support
Self explanatory really
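For the logging, the logrotate side would just be a stock snippet along these lines (the log path here is only an example; point it at wherever you configure the log to live):

/var/log/pacman-cache.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}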
Currently the script only supports the i686 architecture (due to some hard-coded paths); I'd need to do some recoding to support x86_64 as well. It's something I'm considering, assuming I can get VMware to run a 64-bit Arch install. Resuming is still on the "maybe" pile as I'm still trying to come up with a way of coding it cleanly; it's a case of balancing effort vs reward on this one.
I'm also currently working on a quick-and-simple administration interface for the cache. It should let you see what files are cached and remove selected ones or an entire repository's worth of cache. It may even be able to verify the local files against the md5 summary files I build. It's in the early stages right now, but it'll hopefully be complete in the near future.
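For the verification part I'm picturing something as simple as the following (a sketch only; the summary file format shown, one "<md5>  <filename>" pair per line like md5sum output, is just an assumption about how I'll end up storing them, and the directory is an example):

<?php
// Check each cached package against an md5 summary file (sketch only).
$cacheDir = '/srv/pacman-cache/extra/os/i686';   // example repository directory
$sums = file($cacheDir . '/md5sums.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($sums as $line) {
    list($sum, $file) = preg_split('/\s+/', trim($line), 2);
    $path = $cacheDir . '/' . $file;
    if (!is_file($path)) {
        echo "MISSING  $file\n";
    } elseif (md5_file($path) !== $sum) {
        echo "CORRUPT  $file\n";
    } else {
        echo "OK       $file\n";
    }
}
?>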
Overall it works very well and depending on how many bugs I run into when giving it a real test, I should be able to release it here in the not-too-distant future, then anyone who's interested can play with it and perhaps improve it beyond my original design.

Similar Messages

  • IOS SLB transparent web-caches

    Hi all,
    I wanted to ask if anybody has implemented SLB with transparent web-caches, and if so, how was it done?
    Can anyone give me a short overview of how to implement this? I couldn't find enough information on the Cisco site. We are able to implement SLB for various services, but unable to do the same thing with transparent web-caches.
    Any help would be greatly appreciated
    thanks

    Hi,
    you can do this, for example, with a CSS (see http://www.cisco.com/en/US/products/hw/contnetw/ps789/products_configuration_example09186a0080093ff3.shtml). That is an example of how to implement transparent caching with a CSS. The same works if you are using a CSM. A Content Engine is required.
    Regards,
    Joerg

  • What is the HTTP caching flowchart/logic in Firefox?

    Hello,
    I am configuring a reverse HTTP proxy and I am trying to optimize it as much as possible. I found the following article, written on Oct. 9, 2002, which describes the HTTP caching logic in Firefox.
    http://www-archive.mozilla.org/projects/netlib/http/http-caching-faq.html
    However, the article is pretty old and I don't think that Firefox uses the same flowchart in the latest versions of the browser. Do you know exactly how Firefox caches a given object? What I mean is: does FF check the Cache-Control header first, then the Expires header, and finally Last-Modified? What happens if both a Cache-Control header and an Expires header are present?
    Kind Regards,
    Daniel K.

    WebLogic Server is a single Java process that has two listen ports, one SSL
    and one non-SSL.
    These two ports use protocol discrimination to handle multiple protocols on
    a single port.
    NON-SSL --> http, t3 (proprietary RMI protocol), iiop
    SSL --> https, t3s, iiops
    So WebLogic comes with a built-in web server. Or you can use a third-party
    web server in front of WLS with a plugin to proxy to WLS.
    See;
    http://edocs.bea.com/wls/docs81/plugins/
    Cheers
    mbg
    "Manoj" <[email protected]> wrote in message
    news:3edb0ba5$[email protected]..
    >
    Is the built-in web server in weblogic Apache or is it some other httpserver that
    BEA owns ?

  • Pacman ignores writable cache, uses /tmp. Due to cache permissions

    Hi, I had a problem today that wasted a bit of time so I thought it worth posting here. Let me know if I should raise this as a bug.
    I noticed that my pacman cache wasn't being found when doing an update.
    Doing "pacman -Syu" gave a warning "couldn't find or create package cache, using /tmp instead".
    However, I have a cache at /var/cache/pacman/pkg and I have confirmed that it's writeable and I have checked that I have nothing in my /etc/pacman.conf to change from the default cache.
    Even doing "pacman --cachedir /var/cache/pacman/pkg -Syu" didn't help.
    When I used "--debug" I saw this:
    debug: skipping cachedir, no write bits set: /var/cache/pacman/pkg/
    debug: option 'cachedir' = /tmp/
    debug: using cachedir: /tmp/
    warning: couldn't find or create package cache, using /tmp/ instead
    Which caused me to further investigate writing to the directory. All these tests worked - it was there and I could write to it. I then checked the permissions. There were none, it was "d---------". When I changed this to 755 all started working. I don't know what caused its permissions to be wiped.
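    For reference, the one-liner that fixed it (assuming the default cache path) was simply:
    # chmod 755 /var/cache/pacman/pkg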
    Now, the permissions were wrong, I admit, but as root can write anyway it seems rather odd that this stops pacman from using it. I notice that libalpm explicitly checks the permissions after a patch was applied in 2011.
    I think it would help if pacman provided the reason why the cache is not used rather than just saying "couldn't find or create package cache".
    I'll raise a bug report if it's worthwhile.

    Scimmia wrote:
    I'm starting to feel like a bot.
    Intel CPU? If so, microcode up to date?
    THANK YOU SO MUCH! It was driving me crazy and I felt like it was either QT or GTK or some library that was messed up... did a full reinstall. Also ran several hours of memtest to check the hardware (I didn't notice the issue until after several recurring power outages and feared the worst). Once again, thank you very much!

  • ABAP HTTP cache refresh

    Hi-
    What role is required for an Integration Server (ABAP) HTTP cache refresh?
    This is accessible if you go to XI Administration -> Cache Overview and click on Full Cache Refresh for INTEGRATIONSERVER_. It calls this URL.
    http://hostname:8000/sap/xi/cache?sap-client=800&mode=F
    XISUPER has all SAP_XI* roles. I get a "403 Forbidden" unless I include SAP_ALL.
    Thanks,
    J Wolff

    Hi,
    Check this document:
    https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/1a69ea11-0d01-0010-fa80-b47a79301290
    Also verify if the user has such roles:
    SAP_XI_ID_SERV_USER
    SAP_XI_IS_SERV_USER
    SAP_XI_IR_SERV_USER
    Regards,
    Wojciech

  • Programmatically disabling the HTTP cache in the JVM

    How do I programmatically disable the HTTP cache in the JVM that is being used by the URLConnection classes?

    Working through the HTTP request properties did the trick, but I actually used the If-Modified-Since property, which was more appropriate for me for a number of reasons.
    Edited by: beuchelt on Nov 25, 2008 9:02 PM

  • What HTTP caching headers will prevent iPhoto from re-downloading each image in a "Photo Feed" every time it checks the feed for new content?

    I've noticed a practical issue with iPhoto's File->"Subscribe to Photo Feed..." feature.
    Every time that iPhoto checks a subscribed RSS Photo Feed for new content it will re-download each and every photo in the feed.  This occurs even if iPhoto has already downloaded each image, they all have unique GUIDs set, and no content in the feed has changed since the last time it polled for it.
    My best guess is that this happens if the server offering the feed does not produce the proper cache-control parameters in the HTTP Headers that are sent by the server when the RSS feed is accessed by iPhoto.
    Does anyone know what parameter/value pairs need to be set in the HTTP headers to prevent iPhoto from re-downloading RSS enclosures that it already has?
    This is a very practical problem for feeds with a large number of photos of large file size.  Besides the obvious massive waste of bandwidth, the user receives an annoying error whenever they try to quit iPhoto before the feed and all of its images have been re-downloaded yet again.

    Here is an example of the HTTP headers for two of the images in the feed.
    Any red flags in there that you think might be causing iPhoto to think it needs to re-download the images after it's already downloaded them once?
    HTTP/1.1 200 OK
    Content-Length: 1251893
    Date: Fri, 27 Jan 2012 15:37:49 GMT
    Server: Apache
    Set-Cookie: PHPSESSID=1b9d47ae170ba8c0a088ab7124a1677e; path=/
    Expires: Mon, 24 Jan 2022 15:37:49 GMT
    Cache-Control: max-age=315360000,public
    Pragma: public
    Last-Modified: Thu, 26 Jan 2012 23:34:38 GMT
    Accept-Ranges: none
    Connection: close
    Content-Type: image/jpeg;
    HTTP/1.1 200 OK
    Content-Length: 744151
    Date: Fri, 27 Jan 2012 15:39:01 GMT
    Server: Apache
    Set-Cookie: PHPSESSID=086c112f99ccecc266a47d66d6b47733; path=/
    Expires: Mon, 24 Jan 2022 15:39:01 GMT
    Cache-Control: max-age=315360000,public
    Pragma: public
    Last-Modified: Thu, 26 Jan 2012 02:54:40 GMT
    Accept-Ranges: none
    Connection: close
    Content-Type: image/jpeg;

  • HTTP Cache control / GZIP compression

    Hi,
    is it possible to add / enable cache control / compression in Connect Pro (7 or 8)? If so, how?
    I suspect we'll have to tweak the underlying http server for it but I'd like to know if this is documented somewhere and if it's supported. Especially, we'd like to
    - add / configure Cache-Control http headers
    - add / configure Expires http headers
    - activate GZIP/Deflate compression
    If this isn't something Connect offers out of the box, any recommendations for proxy servers to put in front of Connect and things to watch out for?
    Thanks,
    Dirk.

    Thanks. We solved it by adding a custom Servlet Filter to the web.xml.
    Dirk.

  • HTTP cache doesn't work as expected in an applet

    Hi, everyone!
    I wrote an applet which dynamically connects to an HTTP server and fetches content from it. I would like the content to be cached by the browser's cache system.
    Here is my code. It works well in the MS JVM, but when I switch to the Sun plugin, the cache system doesn't work. Every time I get a response code 200 (I expect a 304).
    ==================My code================================
    URLConnection conn = url.openConnection();
    conn.setUseCaches(true);
    conn.setAllowUserInteraction(false);
    conn.connect();
    java.io.ObjectInputStream ois = null;
    try {
        ois = new java.io.ObjectInputStream(
            new java.util.zip.InflaterInputStream(conn.getInputStream()));
        return ois.readObject();
    } finally {
        if (ois != null)
            ois.close();
    }
    ==================My code================================
    I tested on Sun JDK 1.4.2 and IE6.
    My HTTP server is Jetty, and I did add a Last-Modified header to the HTTP response.
    I looked in the Java plugin control panel; only the jar file is cached.

    Sorry for the confusion, of course you are right and it seems to be a bug.
    I meant that the method name setUseCaches itself might be confusing because it does not force caching, it just allows it when available.
    There are several bugs about the plugin caching (4912903, 5109018, 6253678, 4845728, ...); even if they might not fit your problem 100%, they show that plugin caching is an issue which Sun really should improve.
    You might have a look at them and vote for them or submit a new bug.

  • HTTP caching headers - WL6

    I'm evaluating using WL6 as my web server but can't find a way to specify
    what caching headers should be sent with HTML files. I would like to be able
    to specify an expiration for the HTML files so client browsers will pick up
    future updates instead of pulling from a non-expiring cache.
    Tim Kuntz
    [email protected]


  • Firefox is caching my website page even though the HTTP Cache-Control header was in use

    Hi
    Http header "Cache-Control: private, no-cache, no-store, must-revalidate" was used but firefox still caching my webpage. authenticated page of website can be viewed by using work offline feature of firefox.
    Please Help
    Thanks
    Prasant Sharma

    https://developer.mozilla.org/en-US/docs/HTTP_Caching_FAQ
    CORS was added to Firefox 29, IIRC.
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS
    Have you looked into Page Info, specifically the Meta Tags?
    https://support.mozilla.org/en-US/kb/page-info-window-view-technical-details-about-page

  • HTTP Caching on DMP 4400G 5.2.2

    I have created a website that cycles through images every 30 seconds.   I have pointed my DMP at this URL and the pictures display and cycle as expected.   However, I noticed there was no caching of the images: even with no change, when the cycle went to repeat, every image was downloaded fresh.
    To solve this, I have enabled Cache on my DMP 4400G (version 5.2.2).   Unfortunately, the caching seems rudimentary based on some traces I ran.
    I was expecting the DMP to continue to send HTTP GET requests for the various images, just using the If-Modified-Since header, similar to the way ACNS or your local browser would implement this feature.  Instead, it would appear the DMP doesn't check at all whether there is a newer version of the cached file; it simply displays what is in its cache.  I assume the cached item would expire at some point, but there appears to be no mechanism to control this behavior.
    Has anyone played around with this feature and been able to get the DMP to do 'smart' caching?
    Thanks!
    Jason

    Hi Peter
    Thanks for the response!   In fact you're right, a HEAD request is sometimes issued for the target URL.  Upon a new load of the URL (in my case, I have it as part of a playlist), a GET seems to always be issued.  This is fine (though perhaps not optimal).
    The crux of the issue is all of the assets of the page (in my case, a number of image files) are NOT checked.   So if I swap the image files (while preserving the name), the same old image is stuck in the cache of the DMP indefinitely.
    With the NTP capabilities in 5.2.2, I'd like to suggest consideration of the If-Modified-Since header in a GET instead of the current HEAD-based checking (or perhaps in conjunction with it).  In addition (and more importantly), it is important to check all of the assets of an HTML page, not just the page itself.
    I'll be happy to open a TAC case if this is the recommended course of action - I wasn't sure as I think this may be more of a feature request than a bug fix (although clearly without checking assets of an HTML page I would suggest caching is fundamentally broken).
    Please let me know your thoughts
    Thanks!
    Jason

  • Securing ability to Invalidate HTTP Cache in SMICM

    I am a security administrator at my company and I have a request to provide certain developers with the ability to invalidate the HTTP server cache in our BI development system.  Since I am also a Basis administrator I can do this through transaction SMICM, but I do not want to provide them with that tcode.  I tried an authorization trace to see what is checked when invalidating the cache, but the trace came up empty, so I assume the system figures that if I got that far there was nothing else to check.  I'm interested in knowing whether there is a function module that is called when the action is clicked, one that I might be able to secure so the developers could use that method of addressing their need.  If anyone has any ideas on this, I would appreciate your response.

    Hi,
    Check out the pdf it may help you.
    Web functionality for BW [click here|https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/994a06ed-0c01-0010-878b-e796a9060209]
    Rakesh

  • [SOLVED] Make Pacman cache packages installed with -U

    When installing a package with -U, pacman doesn't cache packages as it does when installing with -S. Is there a way to change this behavior?
    I read somewhere that specifying the package's full path and prepending "file://", as in "pacman -U file://path/to/package.pkg.tar.xz", should do this, but it often complains of an "invalid or corrupted package (PGP signature)" error.
    How can I make pacman always copy installed packages to the cache?
    MOD EDIT: change 'closed' to 'solved' in title to avoid confusion
    Last edited by fukawi2 (2013-07-06 13:09:21)

    You can't.  This has been discussed on the bugtracker several times.
    https://bugs.archlinux.org/task/35699
    https://bugs.archlinux.org/task/31243
    https://bugs.archlinux.org/task/15143
    https://bugs.archlinux.org/task/18012
    If you're building the packages yourself with makepkg, you can set PKGDEST to your cache in makepkg.conf.  Otherwise you could write a small pacman wrapper that copies the package after installing it.
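    For the makepkg route that's a one-line change in makepkg.conf, along the lines of:
    PKGDEST=/var/cache/pacman/pkg
    (Pointing it at pacman's own cache directory is just one option; any writable directory works.)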

  • Pacman cache share over LAN is a pain!

    Hi fellows. As the Arch Wiki - Pacman Tips page says, I set up an NFS server on my home server so all the laptops can upgrade packages more quickly and without increasing load on the servers.
    It all sounds great, but it turns out there is one big problem. When a client is using the server to upgrade, the integrity check of packages takes a very long time. And believe me, downloading them is even faster than that check.
    Is there something I can tweak to make this faster?
    This is my fstab line in each client:
    myserver:/var/cache/pacman/pkg /var/cache/pacman/pkg nfs4 defaults 0 0
    And this is the line of /etc/exports in the server:
    /var/cache/pacman/pkg *(rw,async,subtree_check,no_root_squash)
    Thanks for your help in advance!

    Stebalien wrote: You could try: http://xyne.archlinux.ca/projects/pacserve/
    It doesn't use NFS but should be faster.
    I thought about using it but I didn't want to depend on an extra package. Before using it, I want to know if my problem is easily solvable.
