Yahoo Groups going away

24 Oct 2019

I have been reasonably successful, after making a few mods, in backing up Yahoo groups using
a clone of "Yahoo Group Archiver", which broadly works (but see below) and
doesn't need any scraping tools.
I made a few tweaks, concentrating on speed rather than on documenting the code, and the
current mess is here:

https://1drv.ms/u/s!Ag4BJfE5B3onleMG29vMs5czmPcoTw?e=TrqawF

The script yahoo.py is supposed to back things up to files. I couldn't get the
user/password login part to work, but noted that the scripts also support putting the
cookies on the command line.
So I downloaded a cookie manager for Firefox, logged into Yahoo and added code to set the
values at the top of the script. The result is "yahoo1.py". It's pretty obvious
which cookies are needed.
I found this fails on unnamed photo albums. I also found file download flaky. So Yahoo2
fixes photo albums with duff names and skips downloading any files that already exist.
This leaves one bug. If a download fails the script may leave an empty file. If you want
to retry that download you need to remove the empty file before restarting the script.
Sometimes Yahoo barfs at a file because updated AV/malware scanners mark it as bad, e.g.
archives which contain "netcat". In this case leave the partial download in place
and allow the script to skip it.
We also don't get file descriptions.
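
For anyone who wants to do the same in their own copy, here is a rough sketch of the three ideas above: give unnamed albums a usable name, skip files that already exist, and clear out empty files before a retry. This is not the code from my scripts. The function names and the "unnamed" fallback are made up for illustration, and it only uses the standard library.

```python
import re
from pathlib import Path


def safe_name(name, fallback="unnamed"):
    """Return a name that is safe to use as a folder or file name.

    Empty or missing names (e.g. unnamed photo albums) get the fallback.
    """
    # Replace characters that Windows does not allow in file names.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", name or "")
    # Windows also dislikes leading/trailing spaces and trailing dots.
    cleaned = cleaned.strip(" .")
    return cleaned or fallback


def needs_download(path):
    """True if nothing exists at path yet, so existing files are skipped."""
    return not Path(path).exists()


def remove_empty_files(root):
    """Delete zero-byte files under root and return the paths removed."""
    removed = []
    # Build the list first so we are not deleting while still walking.
    for p in list(Path(root).rglob("*")):
        if p.is_file() and p.stat().st_size == 0:
            p.unlink()
            removed.append(p)
    return removed
```

Run something like remove_empty_files() only when you actually want to retry failed downloads. Files that Yahoo refuses to serve will just fail again, so those are better left in place to be skipped.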

I am running the scripts on Python 3.7.5 on Windows 10 and use
"py" to run them.
When installing the required "requests" package (see the readme.md) I found I
had to enter the full path to pip (it is in the Scripts folder).
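
A possible way round the pip path problem is to run pip as a module of the interpreter you already have, so its location doesn't matter. The sketch below builds that command from Python. The helper names are made up for illustration and I have not tested this on every setup.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command that runs under the current Python interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run the pip command; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(package), check=True)
```

From a command prompt the equivalent should be: py -m pip install requests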

I am happy to answer any questions but note I am in the UK (that is East Pondia not the
University of Kentucky) so please allow for my time zone.

Dave
G4UGM
P.S. I now hate python....
PPS I also now hate Yahoo.

...
  -----Original Message-----
 From: cctalk <cctalk-bounces at classiccmp.org> On Behalf Of Steve Malikoff via
 cctalk
 Sent: 24 October 2019 02:51
 To: General Discussion: On-Topic and Off-Topic Posts <cctalk at classiccmp.org>
 Subject: Re: Yahoo Groups going away

 Jim said
  On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
  > Yeah, it sucks. The Tomy Tutor users group has been there for
  > years, and I guess we'll jump over to groups.io. I managed to
  > archive everything last night.
 What's your strategy for archiving material off YahooGroups? Their
 Files and Photo (photostreams) sections are so heavily
 Javascript-encrusted that it's not at all easy to bulk archive from
 them. I tried a few tools (httrack, wget,
 curl) with no valid results, but I only used some basic settings.

 For the messages, I used

 	https://github.com/andrewferguson/YahooGroups-Archiver

 Unfortunately, the (rather inadequate) Y!G API for files makes it
 difficult to iterate over files in a directory tree. I ended up
 manually downloading them, since it was only about 30 files and not
 worth ginning up something to scrape them. Some people have used

 	https://github.com/csaftoiu/yahoo-groups-backup

> to get everything but it needs a MongoDB instance which seemed kind
> of overkill for a one-time dump.

 I didn't get that to work. Has anyone here got suggestions? Contact
 off list.  It is getting errors, and I spent about an hour trying to
 figure it out.

 Every issue was a bug in either Python that was unresolved, or the
 tools they were using, not errors in the tool, so I'm not really
 interested in a lot more debugging.

 I suspect it ran at some point, maybe I've got the wrong versions of
 some sort.

 thanks
 Jim

 I set it up with Python 3.7.3, pip installed the required modules such as
 Selenium, installed geckodriver for Firefox (but I don't run Firefox on this
 machine, I use a popular fork) and it emitted an error that refers to Selenium
 not being the correct match for Firefox.
 I have other things to do so that's where I left it for now; I will try it out again
 sometime soon with an earlier actual Firefox.

 Steve. 
