Yahoo Groups going away

steven at malikoff.com
Wed Oct 23 20:50:32 CDT 2019


Jim said
> On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
>>>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>>>> guess we'll jump over to groups.io. I managed to archive everything last
>>>> night.
>>> What's your strategy for archiving material off YahooGroups? Their Files and
>>> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
>>> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
>>> curl) with no valid results, but I only used some basic settings.
>> For the messages, I used
>>
>> 	https://github.com/andrewferguson/YahooGroups-Archiver
>>
>> Unfortunately, the (rather inadequate) Y!G API for files makes it difficult
>> to iterate over files in a directory tree. I ended up manually downloading
>> them, since it was only about 30 files and not worth ginning up something
>> to scrape them. Some people have used
>>
>> 	https://github.com/csaftoiu/yahoo-groups-backup
> I didn't get that to work.  Has anyone here got suggestions? Contact off
> list.  It is getting errors, and I spent about an hour trying to figure
> it out.
>
> every issue was a bug in either Python that was unresolved, or the tools
> they were using, not errors in the tool, so I'm not really interested in
> a lot more debugging.
>
> I suspect it ran at some point, maybe I've got the wrong versions of
> some sort.
>
> thanks
> Jim
>> to get everything but it needs a MongoDB instance which seemed kind of
>> overkill for a one-time dump.


I set it up with Python 3.7.3, pip-installed the required modules such as
Selenium, and installed geckodriver for Firefox (though I don't run Firefox on
this machine, I use a popular fork). It emitted an error that refers to Selenium
not being the correct match for Firefox.
I have other things to do, so that's where I left it for now. I will try it
again sometime soon with an earlier, actual Firefox.
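
A rough sketch of the kind of check that might be run before retrying, to see
whether the browser on the machine is one the driver is likely to accept. The
minimum version below is a placeholder assumption, not a figure taken from the
geckodriver or Selenium documentation, and the helper names are made up for
illustration. It also assumes the browser binary answers to "--version" the way
stock Firefox does, which a fork may not.

```python
import re
import subprocess

# Placeholder assumption: the real minimum depends on the geckodriver
# release in use and has to be looked up in its release notes.
ASSUMED_MIN_FIREFOX = (60, 0)


def parse_version(text):
    """Pull a (major, minor) tuple out of text like 'Mozilla Firefox 68.0.2'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return int(match.group(1)), int(match.group(2))


def is_supported(version_text, minimum=ASSUMED_MIN_FIREFOX):
    """True if the reported version is at least the assumed minimum."""
    return parse_version(version_text) >= minimum


def installed_version_text(binary="firefox"):
    """Ask the browser binary for its version string (hypothetical helper)."""
    output = subprocess.check_output([binary, "--version"])
    return output.decode("utf-8", "replace").strip()
```

Calling is_supported(installed_version_text()) gives a quick yes or no before
spending more time on the Selenium side. For a fork, pass its binary name
instead of "firefox".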

Steve.


