AnimeSuki Forums

Old 2006-05-07, 17:31   Link #1
Junior Member
Join Date: May 2006
Scarywater's Scrape info

Anyone know why scarywater scrape info does not return the 'downloaded' value? It always returns 0.

"d5:filesd20:<20-byte binary info_hash>d8:completei26e10:downloadedi0e10:incompletei16eeee"

The actual HTML listing has the number of downloads in the 'dls' field, though.
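
For anyone reading along: the scrape response quoted above is bencoded, and a few lines of Python are enough to pull the three counters out of it. This is only a minimal sketch. The `bdecode` helper is my own, and the all-zero `info_hash` is a hypothetical placeholder, because the real 20 binary bytes did not survive being pasted into the post.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                       # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                       # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                       # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                     # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected data at offset {pos}")


# Hypothetical placeholder; the real hash is 20 arbitrary binary bytes.
info_hash = b"\x00" * 20
response = (b"d5:filesd20:" + info_hash +
            b"d8:completei26e10:downloadedi0e10:incompletei16eeee")

decoded, _ = bdecode(response)
stats = decoded[b"files"][info_hash]
print(stats[b"complete"], stats[b"downloaded"], stats[b"incomplete"])  # 26 0 16
```

The dictionary keys come back as bytes, so the info hash itself is the lookup key under `files`.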

ofke0m is offline   Reply With Quote
Old 2006-05-08, 01:50   Link #2
Junior Member
Join Date: May 2006
Answering my own thread: it looks like an issue with the tracker, hypercube, which seems to always return 0 or 1 for the 'downloaded' value.
ofke0m is offline   Reply With Quote
Old 2006-05-09, 07:57   Link #3
Don't panic.
Join Date: May 2003
Location: Galactic Sector ZZ9 Plural Z Alpha
Age: 32
I took a peek at the hypercube source. Please note that I have no access to the scarywater scripts, so that part is all guesswork.

This behaviour is triggered by the scripts running on Scarywater's HTTP server. Scarywater's backend scripts scrape the data off the tracker frequently (every 5 minutes or so) to update the stats on the web site. Every such request resets the tracker's downloaded count, and the scripts then add the value they got to their own database. So what you basically get when you scrape is the number "downloaded" since the last scrape issued by Scarywater.

I explain better in pseudo-code than in English.
Scarywater's scripts:
def update_stats(seeds, peers, complete)
  # db is a placeholder for whatever database handle the scripts use
  db.execute("UPDATE stats SET seeds = ?, peers = ?, complete = complete + ? WHERE info_hash = 42", seeds, peers, complete)

def scrape_every_n_minutes
  seeds, peers, complete = http.get "http://tracker/status?raw&reset=1"
  update_stats(seeds, peers, complete)

Hypercube (the tracker):
Receives the status query, notes that there's a reset parameter included, and resets the downloaded count to zero. See tracker.c line 532 (functions serve_status_raw and serve_status_html).
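
To make the effect concrete, here is a small Python simulation of the reset-on-read behaviour described above. It is a toy model under my own assumptions, not hypercube's actual code: `ToyTracker` and the per-interval completion numbers are made up for illustration.

```python
class ToyTracker:
    """Toy stand-in for a tracker whose status query can reset the counter."""

    def __init__(self):
        self.downloaded = 0

    def record_completion(self):
        self.downloaded += 1

    def status(self, reset=False):
        value = self.downloaded
        if reset:
            self.downloaded = 0   # counter starts over after a reset query
        return value


tracker = ToyTracker()
site_total = 0       # what the website's database accumulates
outsider_view = []   # what a third-party scrape sees

for completions in [3, 0, 1, 2]:      # made-up completions per interval
    for _ in range(completions):
        tracker.record_completion()
    site_total += tracker.status(reset=True)   # the website's own scrape
    outsider_view.append(tracker.status())     # someone else scraping right after

print(site_total)      # 6
print(outsider_view)   # [0, 0, 0, 0]
```

The website ends up with the full count, while anyone else only ever sees what has completed since the most recent reset, which is usually nothing.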

Another thing I noticed is that anyone seems to be able to issue a stats reset, so you can practically kill Scarywater's downloaded stats collection, assuming mxs didn't plug that hole by restricting reset calls to a single IP.

How to avoid using the reset function, then? By adding a new database column [1] that keeps track of completed values and can detect a tracker reset.

[1] I've been bugging him for ages to do this, so that the RSS feed would behave better.
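
A rough Python sketch of what that extra column could do, assuming the tracker reports a cumulative count that only goes backwards when it is reset or restarted. The `apply_scrape` function and the `total` / `last_seen` field names are hypothetical and have nothing to do with Scarywater's real schema.

```python
def apply_scrape(row, current):
    """Fold one cumulative tracker reading into a running total.

    row holds 'total' (all completions counted so far) and 'last_seen'
    (the tracker value at the previous scrape, i.e. the new column).
    """
    if current >= row["last_seen"]:
        delta = current - row["last_seen"]   # normal case: counter grew
    else:
        delta = current                      # counter went backwards: tracker was reset
    row["total"] += delta
    row["last_seen"] = current
    return row


row = {"total": 0, "last_seen": 0}
for reading in [5, 9, 2, 4]:   # made-up readings; the drop from 9 to 2 is a reset
    apply_scrape(row, reading)

print(row["total"])   # 13
```

One known gap in this approach: if the tracker is reset and then climbs back past the previous reading before the next scrape, the reset goes unnoticed and some completions are lost.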

(actually, any browser is fine to me as long as it doesn't use IE's rendering engine...)
Generating the #animesuki stats since 2003
hhaamu is offline   Reply With Quote
Old 2006-05-09, 10:03   Link #4
Join Date: Nov 2003
Age: 44
Since scarywater's main site is in xhtml (and thus a clean xml document), you *could* scrape the actual xhtml page and get those values ^^;;;; And use the file name as the key for retrieving the values you want.
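
Something along these lines might work, using Python's standard XML parser. Note that the markup in `SAMPLE_PAGE` is entirely invented for the example: the table layout and the `file` / `dls` class names are assumptions, so the lookups would need adjusting to whatever the real page contains.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Invented sample; the real page structure is not known here.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><td class="file">example-01.avi.torrent</td><td class="dls">1234</td></tr>
      <tr><td class="file">example-02.avi.torrent</td><td class="dls">567</td></tr>
    </table>
  </body>
</html>"""


def downloads_by_file(page):
    """Map each file name to its download count, keyed by file name."""
    root = ET.fromstring(page)
    result = {}
    for row in root.iter(XHTML_NS + "tr"):
        # Collect the row's cells by their class attribute.
        cells = {td.get("class"): (td.text or "").strip()
                 for td in row.findall(XHTML_NS + "td")}
        if "file" in cells and "dls" in cells:
            result[cells["file"]] = int(cells["dls"])
    return result


counts = downloads_by_file(SAMPLE_PAGE)
print(counts["example-01.avi.torrent"])   # 1234
```

This only works as long as the page really is well-formed XML; a strict parser will refuse anything else.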
Sylf is offline   Reply With Quote
Old 2006-05-13, 15:00   Link #5
that guy
Join Date: Jan 2003
Location: Germany

Maybe I can shed some light on this. I never treated the hypercube-generated stats (which are returned on scrapes) as anything more than volatile -- as such, I have never taken great care to protect them or keep them accurate (as they will just be lost on a program restart, anyway).

Stats on the website are generated from the hypercube logfile and, as hhaamu correctly surmised, stored in a database. The data displayed on the website comes from that database and should be more accurate than hypercube's own stats (especially across invocations).

If you absolutely, positively need up-to-the-minute stats for a torrent, the website is your best bet (updated every 5 minutes). Hypercube is written more for speed than features.
(Fun tidbit : is not restricted either, and neither are any resets -- as the data is a purely cosmetic byproduct of hypercube's operation, anyway.)

mxs is offline   Reply With Quote
