
yG Privacy Dashboard download #yahooprivacydashboard #ioimportpl


 

Yahoo sure seems to want to make it difficult for people to archive their groups.

I finally managed to get Mark Fletcher's Chrome application to work, and downloaded a lot of messages, but that one doesn't include any attachments. It does appear to get most of the messages though, unlike the Yahoo data. I downloaded 500 at a time, and it timed out after about 10,000 for a couple of hours, and then let me download the rest. It's a bunch of text files, which I presume I can use as MBOX files, but I've not tried yet. The downloads were quite fast, even on my slower DSL connection. That's at this link again: http://yahoogroupedia.pbworks.com/w/page/93006447/Chrome

I never managed to log in through PG Offline. I'm reinstalling it to try one last time to see if I can make it work. I hope I can use it as a reader, and get files from others who downloaded some of my groups. I should have downloaded it months ago. It's a wonderful program, but who anticipated that Yahoo would implode like this?

I also tried this downloader, which claims to download everything like PGO. Even after getting two techies involved, I still just get 'API query failed' for files and photos, and when it got to topics it downloaded a whole bunch of pornographic spam messages (!) and no legitimate posts from my groups. Others seem to have had better luck:

I suspect I'll be spending my Saturday manually downloading files and photos and any attachments I can get my hands on.

Thanks for all the help and commiseration on this list!

Isis


 

Thanks, John.

I had looked at that page a few weeks ago, but shied away from it, because I dislike Chrome so much. But I'm trying to do exactly what you're describing, salvaging as much as I can, even if it's in pieces.

I hit a wall with both PG Offline and the Chrome application, so I'm in a bit of an uncomfortable holding pattern at the moment, considering the looming deadline.

With PGO I get a perpetual login error.

And with the Yahoo Messages Application I get "Failed to load extension - Manifest file is missing or unreadable - Could not load manifest". I do see a JSON file named manifest in the folder, but have no idea if it's readable. I wonder if I followed the instructions about what to do with the zip file correctly. The instructions talk about creating a directory and copying the files there, and I don't think I know what that means. I figured I was just supposed to unzip it, and then point Chrome at the unzipped folder, and that's what I did.

Any insight you might have acquired while you got it to work?

In the meantime I'm going to request another batch of Yahoo data, and hope there are additional puzzle pieces. Yahoo sent out a mailing that we can request our data until Saturday night 11:59pm PACIFIC, and that they won't delete the archives until they've fulfilled the last open requests. I'm hoping that requesting more data is a way to extend the archives just a bit longer, to buy time to download the rest.

Isis


 

On 2019.12.12 2217, isis feral via Groups.Io wrote:
I played around with it some more, and it's becoming clear now that the
Yahoo data is NOT COMPLETE. I looked up some of the attachments in the
archives, and at first thought that the missing attachments in the MBOX
files were also missing from the archives, but I found some attachments
in the archives that were just as inaccessible in Thunderbird as the
ones not in the archives.

But worse, once I could see the total number of messages displayed, I
noticed that there are only about 4,000 of 11,000 messages in one of my
groups, and it looks like it's similar with the other groups. Then I
noticed that entire stretches of time were missing from the files. I
have files up to December 2017, but the next batch doesn't start until
August 2019. There are also all sorts of bizarre posts with weird dates,
just showing up at the beginning of some of the folders, so I have posts
that look like they're from 1998 or 1969, when there was no such group,
just inserted with current posts. Some of those weird posts are blank,
others are legitimate posts, some seem to be partial posts, and they
have no subject line, and have weird sender names.

Another demonstration of Yahoo's reliability....or lack thereof. Once I
got everything of mine off that system that I still can, I'm going to be
happy to walk away, and I'll never look back.

Isis
Interesting.

In the interest of completeness I've been acquiring data in any form I can from Yahoo! Groups.

The Y!G bundles, and other apps.

PGO (for Photos, I'm a group Owner so can do that, download photos.)

Also two Chrome Browser Applications (not extensions).

These have been used quite successfully in the Vivaldi browser (it too is Chromium-based). The Applications wouldn't display a button as in the description/instructions, so I made them visible by entering vivaldi://apps in the address bar. I then saved that on my Speed Dial page, though a Bookmark would work as well.

The one was able to grab a multi-column list of _all_ the members (no 1000 limit). Done.

Chrome Application to Download Members

The other was to download messages.

Chrome Application To Download Messages

I took it a bit at a time, balancing a concern for the deadline against the block by Yahoo! if one tries to gobble too much at a go. Blocks of 200 to 500 to start. Being on the west coast of NA I noticed that when most folks east of me settled down for the evening I could get more done, quicker. I was able to download 1000 msgs at a time. Looking into the saved text files, they appeared fairly complete. I've not had time or further inclination yet to check them.

One reason I chose this belt and braces (suspenders if you prefer) approach was this application downloads the Yahoo! Group message number with the messages.

One should be able to build a number of different things from these files. I have noticed that messages that are deleted still seem to hold a number, but may not appear as messages to us. This might account for some of the difference between the total number of messages and what one can find. Not sure.
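
Because the downloaded files carry the Yahoo! Group message numbers, one thing that could be built from them is a list of the numbers that never arrived. A minimal Python sketch, assuming the numbers have already been pulled out of the text files into a list (how they appear in the files is not shown in this thread, so that step is left out); `missing_ranges` is a made-up helper name, not part of any of the tools mentioned:

```python
def missing_ranges(numbers, first=1, last=None):
    """Return (start, end) ranges of message numbers absent from `numbers`."""
    present = sorted(set(numbers))
    if last is None:
        # Without an explicit upper bound, stop at the highest number seen.
        last = present[-1] if present else first - 1
    gaps = []
    expected = first
    for n in present:
        if n < first or n > last:
            continue
        if n > expected:
            gaps.append((expected, n - 1))
        expected = n + 1
    if expected <= last:
        gaps.append((expected, last))
    return gaps


print(missing_ranges([1, 2, 3, 7, 8, 10]))
```

A gap found this way is not necessarily a failed download: as noted, a deleted message may still occupy a number.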

Cheers, John


 

Isis,

I was able to get the MBOX files to display in Thunderbird, but none
of the attachments are functional.
The mbox files have only placeholders where the attachments ought to be. The attached files, if delivered to you, are in the photos_and_attachments folder. This reflects the fact that Yahoo Groups stored the attachments separately (sharing the 100 GB storage pool with Photos), if it stored them at all - which was a group option.

I also heard a rumor that the shutdown of the archives might be
postponed until the end ... of January, but I see no official notice
to that effect.
See the re-worded help page:


The removal of user access to content is still scheduled for "after December 14th". At which point manual or PG Offline downloads will no longer be possible.

The only thing that has been extended is the deadline for making a getmydata request.

Shal


--
Help: /static/help
More Help: /g/GroupManagersForum/wiki
Even More Help: Search button at the top of Messages list


 

I played around with it some more, and it's becoming clear now that the Yahoo data is NOT COMPLETE. I looked up some of the attachments in the archives, and at first thought that the missing attachments in the MBOX files were also missing from the archives, but I found some attachments in the archives that were just as inaccessible in Thunderbird as the ones not in the archives.

But worse, once I could see the total number of messages displayed, I noticed that there are only about 4,000 of 11,000 messages in one of my groups, and it looks like it's similar with the other groups. Then I noticed that entire stretches of time were missing from the files. I have files up to December 2017, but the next batch doesn't start until August 2019. There are also all sorts of bizarre posts with weird dates, just showing up at the beginning of some of the folders, so I have posts that look like they're from 1998 or 1969, when there was no such group, just inserted with current posts. Some of those weird posts are blank, others are legitimate posts, some seem to be partial posts, and they have no subject line, and have weird sender names.
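
For anyone wanting to check their own files for the same problems, here is a rough Python sketch that counts messages per month, reports stretches of empty months, and lists messages with implausible dates. It uses only the standard `mailbox` module. The cutoff year and the function names are assumptions chosen for illustration, and it has not been run against real Yahoo data:

```python
import mailbox
from collections import Counter
from email.utils import parsedate_to_datetime


def survey_mbox(path, earliest_year=2000):
    """Count messages per (year, month) and collect messages with odd dates."""
    per_month = Counter()
    suspicious = []
    box = mailbox.mbox(path, create=False)
    try:
        for key, msg in box.items():
            raw = msg.get("Date")
            raw = str(raw) if raw is not None else None
            try:
                when = parsedate_to_datetime(raw) if raw else None
            except (TypeError, ValueError):
                when = None  # unparseable Date header
            if when is None or when.year < earliest_year:
                suspicious.append((key, raw, msg.get("Subject")))
            else:
                per_month[(when.year, when.month)] += 1
    finally:
        box.close()
    return per_month, suspicious


def month_gaps(per_month):
    """Return (last_seen, next_seen) pairs with at least one empty month between."""
    months = sorted(per_month)
    gaps = []
    for a, b in zip(months, months[1:]):
        if (b[0] - a[0]) * 12 + (b[1] - a[1]) > 1:
            gaps.append((a, b))
    return gaps
```

Set `earliest_year` to the year the group was created so that anything dated earlier lands in the suspicious list. A quiet group will also show empty months, so a reported gap is only a hint of missing data.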

Another demonstration of Yahoo's reliability....or lack thereof. Once I got everything of mine off that system that I still can, I'm going to be happy to walk away, and I'll never look back.

Isis


 

I'm just getting back to this again now, trying to figure out what's actually there in those massive files from Yahoo. I was able to get the MBOX files to display in Thunderbird, but none of the attachments are functional. They don't have any extensions. I've tried to open some that I know are PDF, but my reader wouldn't open them. I even downloaded one and changed the extension to PDF, but that didn't work either. The Yahoo data came with a folder of attachments, but they don't seem to be connected to the posts, and the attachments I tried are not included. All the attachments Thunderbird shows are called 'Part 1.2'. Is there something I'm missing? Has anyone figured out how to open message attachments in the Yahoo data files?
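
One way to see what those 'Part 1.2' items really contain is to list every MIME part in the mbox file and look at its first few bytes, since a genuine PDF starts with `%PDF` whatever its name or extension. The Python sketch below does that with the standard `mailbox` module. The short table of file signatures and the helper names are assumptions for illustration, and it has not been tried on actual Yahoo data:

```python
import mailbox

# A few well-known file signatures; extend as needed.
MAGIC = {
    b"%PDF": "pdf",
    b"\xff\xd8\xff": "jpg",
    b"\x89PNG": "png",
    b"GIF8": "gif",
    b"PK\x03\x04": "zip",
}


def guess_kind(payload):
    """Guess a file type from the leading bytes, or return None."""
    for magic, kind in MAGIC.items():
        if payload.startswith(magic):
            return kind
    return None


def list_parts(path):
    """Yield (message key, content type, declared filename, size, guessed kind)."""
    box = mailbox.mbox(path, create=False)
    try:
        for key, msg in box.items():
            for part in msg.walk():
                if part.is_multipart():
                    continue  # containers hold no data of their own
                payload = part.get_payload(decode=True) or b""
                yield (key, part.get_content_type(), part.get_filename(),
                       len(payload), guess_kind(payload))
    finally:
        box.close()
```

A part that is only a few bytes long and has no recognisable signature is probably not the file itself, in which case changing its extension cannot make it open.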

Yahoo sent out something a few days ago, warning of the imminent shutdown of the archives, in which they said if I request my data, they will delay the shutdown of those archives until they finish sending me the data. So it seems like requesting data again, even if we have it already might give us a little more time to maneuver. I also heard a rumor that the shutdown of the archives might be postponed until the end of the month or even the end of January, but I see no official notice to that effect. Has anyone else heard anything about that? I sure could use another couple of weeks to download all my data myself, but getting PG Offline at the last minute, if I don't know if there'll be anything left to download come Saturday, doesn't make much sense to me.

Isis


 

On Thu, Dec 12, 2019 at 07:08 PM, <andyt650@...> wrote:

When I am done with the Perl script, will the original time stamp and posting
number and order be preserved?
Posting number is not preserved. Order is (within the import).

The original time stamp is by default inserted at the beginning of the message body. There is no way (without Mark's intervention) to make imported messages appear at their original time, except that you can ask Mark to import messages.zip for $220 per group.

--
Lena


 

Hi Lena,

When I am done with the Perl script, will the original time stamp and posting number and order be preserved?

Thanks,

Andy


 

On Wed, Dec 11, 2019 at 10:42 PM, <andyt650@...> wrote:

how your Perl script handles the "ownership" of each post. When your script is
finished uploading, will the "By XXXXX" in each post list the original
email address of the Yahoo member who posted the message?
Yes if that member is currently subscribed to the .io group (present in memberlist.csv) and neither moderated nor bouncing.

Or will it be ME in
the "By"? (I'm running the script)?
You subscribe a separate mailbox for this purpose (for messages from members not currently subscribed).

when your Perl program parses and makes substitutions from
memberlist.csv, needchange.txt, and map.txt, these will not affect "BY XXXX"
in each post, correct? These changes will only affect the FROM: in the meat
of the post which is on the first line of the post?
If a member is in map.txt then the new address will be in "By".

--
Lena


 

Hi Lena,
I am not a programmer and so I was wondering if you can help me clarify how your Perl script handles the "ownership" of each post. When your script is finished uploading, will the "By XXXXX" in each post list the original email address of the Yahoo member who posted the message? Or will it be ME in the "By"? (I'm running the script)?

Also, when your Perl program parses and makes substitutions from memberlist.csv, needchange.txt, and map.txt, these will not affect "BY XXXX" in each post, correct? These changes will only affect the FROM: in the meat of the post which is on the first line of the post?

I am one of the unfortunate people not given administrator rights to my Yahoo group until after Dec 1. I have downloaded my Yahoo archive through their free service and it is all intact and I was able to confirm the content via Thunderbird. I am also planning to upgrade to Premium so that I can populate the members manually. I am trying to understand what the messages will look like after I run your program. I would like to know if "BY XXXXX" will reflect the original poster, and whether the original FROM will be shown in the meat of the post but altered by your script. I appreciate any assistance you can give to clarify this for me.

Thanks,

Andy


 

I was so glad you mentioned getting different results from your two downloads. After my request on October 25 was returned on October 27 with no data, I was despondent about their download service, but did go on to figure out a way to download the extremely bloated view of messages from the Yahoo Groups web site... and Files. And Photos. And Attachments.

I didn't find your message until November 21, and immediately sent off another request... which was not returned until November 28, but it contained the mbox format archives that you refer to, and the files from the Files area.

Unfortunately, Attachments are not included in the downloads, but since I had already captured them, it is a matter of correlating them to make a decent archive.

Again, thanks for the inspiration to try again.


 

On Fri, Nov 29, 2019 at 11:49 PM, Simon Child wrote:

The blocker was that groups.io blocked message import from the script after 40
messages and told me to try again later. About an hour later I still couldn't
import any more messages from that email address
I added a delay of 50 seconds between messages to avoid the block. The number 50 can be changed at the beginning of the program.
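
The idea of pausing between messages can be illustrated in a few lines of Python. This is not the Perl program's code, only a sketch of the same throttling idea. `send` stands for whatever hypothetical function actually submits one message, and the `sleep` argument exists so the pause can be swapped out when trying the loop:

```python
import time


def send_throttled(messages, send, delay_seconds=50, sleep=time.sleep):
    """Call send(message) for each message, pausing between consecutive sends."""
    sent = 0
    for index, message in enumerate(messages):
        if index > 0:
            sleep(delay_seconds)  # wait before every message except the first
        send(message)
        sent += 1
    return sent
```

At 50 seconds per message, 1,000 messages take close to 14 hours, so a large archive needs a machine that can stay on for days.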

--
Lena


Simon Child
 

Thanks for this

I ran into two more issues. The blocker was that groups.io blocked message import from the script after 40 messages and told me to try again later. About an hour later I still couldn't import any more messages from that email address (I reconfigured to use a different email address for import and processed another 40 messages before that too got blocked).

The other issue was that your script said my mbox archive had "9022 messages (deleted messages excluded)". But Thunderbird finds the expected 15,000 messages in the mbox, which matches the number seen in PG Offline and in Yahoo itself.

Also, imported messages show the date and time of import, not of the original message.

So yesterday, in the window while groups.io was open to accept new transfer requests, we booked a transfer. But in case they don't manage to get it done I've also requested a fresh mbox download from Yahoo!

Thanks for sharing your work.



 

On Fri, Nov 29, 2019 at 12:31 AM, Simon Child wrote:

I'm trying this, and getting error on the first message

Modification of a read-only value attempted at ./import.pl line 218, <GEN0>
line 5.
I fixed the bug. Please download the program again.

What is the benefit of rewriting the From line,
Mostly to avoid moderation of messages from non-members.

I have not created all the new members (our group is not premium and so I
cannot bulk add them) and I have not created map.txt. Is an accurate/complete
map.txt mandatory or optional?
Optional. If the email address is not currently subscribed to the .io group, then the From: line is rewritten, but the original From is inserted at the beginning of the message body.
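
For readers who are not programmers, the sender rules described in these replies can be summarised as a small Python sketch. It is one reading of the answers in this thread, not code taken from import.pl. The formats of memberlist.csv and map.txt are not given here, so the sketch takes an already loaded set of usable subscriber addresses and a dictionary of old-to-new addresses, and `choose_sender` and `import_mailbox` are made-up names:

```python
def choose_sender(original_from, subscribed, address_map, import_mailbox):
    """Return (from_address, body_prefix) for one imported message.

    subscribed     -- addresses subscribed to the new group that are neither
                      moderated nor bouncing (assumed to be lower case)
    address_map    -- old address -> new address (assumed to be lower case)
    import_mailbox -- the separate mailbox subscribed for non-members' posts
    """
    old = original_from.lower()
    address = address_map.get(old, old).lower()
    if address in subscribed:
        # The member can be credited directly under the (possibly new) address.
        return address, ""
    # Otherwise post from the import mailbox and keep the original sender
    # as the first line of the body.
    return import_mailbox, "From: " + original_from + "\n\n"


print(choose_sender("gone@example.com", set(), {}, "import@example.com"))
```

With an empty map.txt the dictionary is simply empty, which matches the answer that the file is optional.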

--
Lena


Simon Child
 

So, I've experimented...

If I comment out line 218 the import appears to work OK!

The From line is not rewritten. But is it necessary for it to be rewritten? I stopped the import after two messages. These both show the original author email and name in From, and also prepended at top of body of message. I'm assuming no emails were sent because they should have been blocked by the hashtag #mi

What is the benefit of rewriting the From line, and if it is important to do that, how do I fix the error on line 218?

Thanks


Simon Child
 

Hi,

I'm trying this, and getting error on the first message

Modification of a read-only value attempted at ./import.pl line 218, <GEN0> line 5.

Anybody overcome this?

I have not created all the new members (our group is not premium and so I cannot bulk add them) and I have not created map.txt. Is an accurate/complete map.txt mandatory or optional?

Thank you


 

On Thu, Nov 28, 2019 at 07:03 AM, Derek Milliner wrote:
MBOX format is a standardised format for, well, mailboxes.
If you plan to have the mbox file imported by GIO, please note the details at

Duane
--
Help: /static/help
GMF's Wiki: /g/GroupManagersForum/wiki
Search button at the top of Messages list
A few site FAQs: /static/pricing#frequently-asked-questions


 

Isis,

MBOX format is a standardised format for, well, mailboxes. Personally I use the Thunderbird email client (free) with the 'ImportExportTools NG' add-on (also free), which knows how to read & write mbox files. Other tools/mail clients are available!

Regards,

Derek


 

Thanks, BMaverick.

I actually don't use WiFi at all usually, or any wireless technology. I'm completely wired at my house, but my DSL connection is the slowest version there is, so a couple of GB take an entire day to download. I didn't see your response before I went to my friend's, but I'm happy to report that her WiFi connection got the job done. Just barely downloaded it all in an hour, so just under the wire (no pun intended). I don't think I would have been able to connect the cable there, since it's a shared connection, with the router far away from the computer. Very relieved it worked out though.

As I'm unzipping the files, I can see that all my groups are represented, but I can't yet figure out if it's all the messages of each group, or just my own. There are many more folders within the folders that need to be unzipped, so that's going to take a while, and then I have to figure out how I can actually look at them. I thought it would be JSON files, but I don't see any file extension, though the file names say MBOX, so I'm a little confused about how to open them.
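
The missing extension does not matter to software that reads the mbox format. As a quick check that needs no mail client, Python's standard `mailbox` module can open such a file by path alone. A minimal sketch follows, with `peek_mbox` as a made-up helper name and the path in the usage comment only a placeholder:

```python
import mailbox


def peek_mbox(path, limit=5):
    """Return the message count and the first few (date, sender, subject) headers."""
    box = mailbox.mbox(path, create=False)
    try:
        total = len(box)
        sample = []
        for index, msg in enumerate(box):
            if index >= limit:
                break
            sample.append((msg.get("Date"), msg.get("From"), msg.get("Subject")))
    finally:
        box.close()
    return total, sample


# Usage (placeholder path):
# total, sample = peek_mbox("path/to/unzipped/mbox/file")
```

If the sample shows several different senders, the file holds the group's messages and not just your own posts.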


 

ISIS,

To help with the download speed, DO NOT use WiFi. WiFi cuts the speed by almost half. Instead run the correct cable from the WiFi router to your computer. Now you can download at much higher speeds.

BMaverick