On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
guess we'll jump over to groups.io. I managed to archive everything last
night.
What's your strategy for archiving material off YahooGroups? Their Files and
Photo (photostreams) sections are so heavily JavaScript-encrusted that it's
not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
curl) with no usable results, but I only used some basic settings.
For the messages, I used
https://github.com/andrewferguson/YahooGroups-Archiver
Unfortunately, the (rather inadequate) Y!G API for files makes it difficult
to iterate over files in a directory tree. I ended up manually downloading
them, since it was only about 30 files and not worth ginning up something
to scrape them. Some people have used
https://github.com/csaftoiu/yahoo-groups-backup

I didn't get that to work. Has anyone here got suggestions? Contact off
list. It is getting errors, and I spent about an hour trying to figure
it out. Every issue was an unresolved bug either in Python or in the tools
they were using, not an error in the tool itself, so I'm not really
interested in a lot more debugging.
I suspect it ran at some point; maybe I've got the wrong versions of
something.
thanks
Jim
> to get everything but it needs a MongoDB instance which seemed kind of
> overkill for a one-time dump.
I set it up with Python 3.7.3, pip installed the required modules such as
Selenium, and installed geckodriver for Firefox (but I don't run Firefox on
this machine, I use a popular fork), and it emitted an error that refers to
Selenium not being the correct match for Firefox.
I have other things to do, so that's where I left it for now; I will try it
out again sometime soon with an earlier actual Firefox.
Steve.
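
Since both reports above point at mismatched versions (Python, Selenium,
geckodriver, the browser), a small check of what is actually installed may
save some debugging time. The following is only a minimal sketch, not part of
either archiving tool: the helper names (tool_version, selenium_version,
environment_report) are made up for illustration, and it assumes that
geckodriver and the browser binary are on PATH and accept a --version flag,
which may not hold for every fork or platform.

```python
import shutil
import subprocess
import sys


def tool_version(command):
    """Return the first line printed by `command --version`, or None."""
    path = shutil.which(command)
    if path is None:
        return None  # command is not on PATH
    try:
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,  # decode output as text (works on 3.7)
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else None


def selenium_version():
    """Return the installed Selenium version string, or None if missing."""
    try:
        import selenium
    except ImportError:
        return None
    return getattr(selenium, "__version__", "unknown")


def environment_report(browser_command="firefox"):
    """Collect the versions that have to agree with each other."""
    # browser_command is an assumption; a fork will use a different name.
    return {
        "python": sys.version.split()[0],
        "selenium": selenium_version(),
        "geckodriver": tool_version("geckodriver"),
        "browser": tool_version(browser_command),
    }


if __name__ == "__main__":
    for name, value in environment_report().items():
        print("{:12s} {}".format(name, value or "not found"))
```

If a fork is in use, pass its executable name as browser_command. The
reported versions can then be compared by hand against the compatibility
notes published for geckodriver and Selenium.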