DNS server problems at Akamai lead to several major sites being
unreachable.
Yesterday's blackout of Apple's and other major web sites was
apparently caused by a mysterious Internet attack on Akamai name
servers.
BOSTON - A service disruption at content hosting company Akamai
Technologies Inc. cut off access to some of the Internet's major Web
sites Tuesday, including Google.com and Microsoft.com, according to
The SANS Institute's Internet Storm Center.
AP - A diplomat who visited the site of a huge explosion in North
Korea said Friday he saw no evidence it was caused by a nuclear test,
and South Korean officials said a mushroom-shaped plume thought to be
from the blast may have instead been a natural cloud formation.
Turns out, after all the various speculation on reasons, the big
northeast blackout from last summer was caused by a bug
that was only just discovered - and not (as many had
speculated) by the Blaster worm that was going around at the time.
The bug meant that an alarm that should have been triggered never went
off - leading to a series of problems that went unnoticed until it was
too late.
The devastating earthquake at Bam, Iran, in 2003 was caused by a rare,
hidden fault that is invisible at the surface, researchers have
claimed.
French investigators say design faults caused the accident which
killed 15 on the world's largest luxury liner.
AP - A man who shut down the city's largest reservoir after he tossed
a bag containing dirty underwear over its fence was ordered to pay
$5,000.
Due to a power failure affecting all of Internap's data center,
LiveJournal is currently completely inaccessible, and we're waiting
on...
My host's server died yesterday and didn't come back until this
morning. Sorry for the interruption. I don't know yet what will happen
to email you sent me yesterday. Apparently it's all going to arrive
soon. Sorry for the inconvenience....
out·age (ou′tĭj) noun
- A quantity or portion of something lacking after delivery or
storage.
- A temporary suspension of operation, especially of electric
power.
When I woke up yesterday after a brief sleep I started to log back
in to different services and as I'm seeing something's funny with my
server, Jim over at #mobitopia
asks "is your site down?".
Damn.
As I checked what was happening, I could see that all sorts of
things were not working on the server. I was starting to fear the
worst ("the worst" in abstract, nothing specific) when I remembered
that I had seen similar symptoms a couple of months ago, and back then
it had been a disk space problem. I run "df" and sure enough, the
mountpoint where a bunch of data related to the services (including
logs) is stored was full (since November the number of pageviews a
month has increased to over 200,000, which creates pretty big
logfiles). As with the last time, the logs were the culprits. Still
half-asleep, I start to compress, move things around and delete files,
when suddenly after a delete I stop cold: "No such file or
directory".
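For reference, the kind of cleanup described above can be sketched roughly as follows. This is a minimal sketch, not the commands actually run that morning: the function name compress_old_logs, the '*.log*' file pattern and the example directory are assumptions for illustration.

```shell
#!/bin/sh
# Compress rotated logs older than a given number of days.
# The function name, file pattern and example paths are hypothetical.
compress_old_logs() {
    log_dir=$1
    days=${2:-7}
    # -mtime +N matches files last modified more than N days ago;
    # files that are already gzipped are skipped.
    find "$log_dir" -type f -name '*.log*' ! -name '*.gz' \
        -mtime +"$days" -exec gzip {} \;
}

# Typical sequence: check which filesystem is full, see what is taking
# the space, then compress instead of deleting.
#   df -k
#   du -sk /var/log/httpd/* | sort -n | tail -5
#   compress_old_logs /var/log/httpd 7
```

Compressing by absolute path, rather than deleting relative to wherever the shell happens to be, also sidesteps the wrong-directory problem.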
What? But I had just seen that file...
I look up the console history and four rm commands had
failed similarly.
Uh-oh.
I run "pwd". Look at the result. "That's not right...". I was
not where I thought I was.
At that point, I woke up completely. Nothing like adrenaline for
shaking off sleepiness.
I look through the command history. At some point in my switching
back and forth from one directory to another, I mistyped a "cd -"
command and it all went downhill from there. Adding to the confusion
was the fact that I used to keep parallel structures of the same data on
different partitions, "just in case". I stopped doing that once I got
DSL back in May last year, opting instead to download stuff to my home
machine, but the old structure, with old data, remained. And, even
more, my bash configuration for root doesn't display the current
directory (the first thing I did after I realized that was add $PWD to
the prompt, but of course by then it was too late).
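A rough sketch of the two safeguards this suggests: a prompt that always shows the working directory, and a delete helper that refuses to run anywhere but the directory you name. The helper rm_in is hypothetical, made up for illustration, and the prompt string is just one common bash choice.

```shell
#!/bin/sh
# bash prompt that shows user, host and current directory
# (\w is the bash escape for the working directory).
PS1='\u@\h:\w\$ '

# Hypothetical helper: delete files only when the shell is in the
# directory we expect to be in.
rm_in() {
    expected_dir=$1
    shift
    # Resolve both paths physically so symlinks do not cause false mismatches.
    actual_dir=$(pwd -P)
    expected_real=$(cd "$expected_dir" 2>/dev/null && pwd -P) || {
        echo "rm_in: no such directory: $expected_dir" >&2
        return 1
    }
    if [ "$actual_dir" != "$expected_real" ]; then
        echo "rm_in: refusing, current directory is $actual_dir, not $expected_real" >&2
        return 1
    fi
    rm -- "$@"
}
```

With this, something like `rm_in /var/log/httpd access.log.1` (path hypothetical) fails loudly after a mistyped "cd -" instead of deleting whatever happens to share the name.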
I had just wiped out the movable type DB, the MT binaries
(actually, all the CGI scripts), the archives, and a bunch of other
stuff in my home directory.
I took a deep breath and finished creating space, and moved on.
First thing I did was restart the services, now that disk space
was no longer an issue. Then I reinstalled the binaries that I had
just wiped out, which I always keep in a separate directory with some
quick instructions on how to install them. That turned out to be a
lifesaver, one of the many in this little story.
After that I put up a simple page explaining the situation (here's
a copy for... err... "historical reference"), plus a
hand-written feed, and worked on the problem in breaks between work.
Then I realized that all the links that were coming in from the
outside (through other weblogs, google, etc) were getting a 404. So as
a temporary measure I redirected the archive traffic to the main page
through a mod_rewrite clause:
RewriteRule /d2r/archives/(.*) /d2r/ [R=307]
That would return a temporary
redirect (code 307) while I got things fixed (one fire out! 10 to
go).
So what next? The data of course. When I came back to Ireland at
the beginning of January I started doing backups of different things
(a "new year, new backups" sort of thing), and I backed up all the
server data directories on Thursday, and then on Saturday I did what I
thought was a backup of my weblog data, through MovableType's "Export"
feature. As things turned out, the latter proved useless, and it was
the "binary" backup that saved the day.
Why? Well, as I started looking at things, I went to MT's "import"
command in cavalier fashion and was about to start when the word
"permalink" popped up in my head. Then it grew to a question: "What
about the permalinks?".
The question was valid because my permalinks are directly based on
the MT entry ids. Therefore, if an import changed the entry IDs, it
would also break all the permalinks. I started cursing myself for not
switching over to using entry-based strings for permalinks, but that
didn't help. So I did a little digging and I realized that I was
right. MT assigns entry IDs on a system-wide basis. So if you have
multiple weblogs on the same DB (which I have, some of them private,
some for testing, etc) OR if you have to recover the data from an
export (which I had to do) you're out of luck. More likely than not,
the permalinks will not work anymore. The exported file did not
include IDs. Re-importing would generate the IDs again. Different IDs.
Different links. Result: broken links all over the place, both within
the weblog and from external sources.
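To make the renumbering concrete, here is a toy model of it. It is not MT's actual code or data format: assign_ids, the tab-separated layout and the sample entries are all assumptions for illustration. IDs come from one counter shared by every weblog, and the export keeps titles but no IDs.

```shell
#!/bin/sh
# Toy model: one ID counter shared by every weblog in the database
# (an assumption based on the behaviour described above).
assign_ids() {
    # Reads "blog<TAB>title" lines on stdin, prints "id<TAB>blog<TAB>title".
    awk -F '\t' '{ printf "%d\t%s\t%s\n", NR, $1, $2 }'
}

# Original posting order: two weblogs interleave in the same database.
original=$(printf 'main\tfirst post\ntest\tscratch\nmain\tsecond post\n' | assign_ids)

# An export of "main" carries titles but no IDs, so importing it into a
# clean database numbers the entries again from 1.
reimported=$(printf '%s\n' "$original" |
    awk -F '\t' '$2 == "main" { print $2 "\t" $3 }' | assign_ids)

printf 'before:\n%s\nafter:\n%s\n' "$original" "$reimported"
```

In this model "second post" starts out with ID 3 and comes back with ID 2, so any permalink built from the old ID now points at nothing, or at the wrong entry.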
This is clearly an issue with the MT database design, which doesn't
seem too well adapted to the idea of recovery. To be fair, however, I
am not sure how other blogging software deals with this problem, if at
all. I think this is one big hole in the weblog infrastructure that we
haven't yet completely figured out, both for recovery and for
transitions between blog software (As Don noted recently).
This is when I started thinking that things would have been much
easier if I had written my own weblog software. :) That thought would
return a few times over the next 24 hours, but luckily I was busy
enough with other things not to indulge in it too much.
After looking online and finding nothing on the topic, I came to
the conclusion that my only chance was to do a direct restore of the
"binary" copy (that is, replacing the clean database with the backup
directly) I had from last Thursday. I did the upload, put everything
in place, and things seemed to go well: I could log in to MT and the
entries up to that point were right where they had to be. So far so
good. I was going to do a rebuild and I thought that maybe now was a
good time to close off all comment threads in all entries (to avoid
ever-increasing comment spam) and I spent some time trying to figure
out how to use the various
MT tools to close comments on old entries. However, they all seem to be ready
for MySQL rather than BerkeleyDB. It wasn't a hard decision to set it
aside and move on.
So I started a full rebuild. The first 40 entries went along fine,
albeit slowly. Then nothing happened. Then, failure. I thought for a
moment that, for some strange reason, the redirect I had set up
yesterday was causing the problem, so I removed it, restarted the
server, and tried again. Failed again. No apparent reason.
I got angry for a second but then I remembered that the "binary"
backup was of everything, including the published HTML files.
Aha! I uploaded those, crossed my fingers, and did a rebuild only of
the index files, and everything was up again. Actually, this was
important for another reason, since the uploaded images that are
linked from the entries end up by default in the archives
directory, you need a backup of that or the images (and whatever else
you upload into MT) will be gone if you lose the site.
So the solution up until this point had been a lot simpler than I
thought at the beginning.
But wait! All the entries after last Thursday were missing, and I
didn't have a backup for those. That was when RSS came to the rescue
in three different forms: 1) I download my own feeds into my
aggregator, so there I had a copy up to a point. 2) Some kind souls,
along with their condolences for the problem, sent along their own
copy of the latest entries (Thanks!!--and Thanks to those who sent
good wishes as well). 3) Search engines (Feedster was the most up to
date--btw, it was Matt that
suggested yesterday, also on #mobitopia, that I check out Feedster as
a source of information, a great idea that really applies to many
search engines if their database is properly updated), had cached
copies that I could use to check dates and content. So armed with all
that information I set out to recreate the missing entries.
Here the problem of the permalinks surfaced again. I had to be
careful on the sequencing, or the IDs wouldn't match. So I re-created
empty entries, one-by-one, to maintain the sequencing (leaving them
unpublished), actually posted a couple of updates
of what was going on, and then I published the recovered entries
as I entered the content and set the right dates.
So. All things are restored now (except for the comments from the
last week, which are truly lost--this makes me think that setting up
comment feeds would be a good idea. However, that doesn't address how
I would recreate the comments given what happened. Would I post them
myself under the submitter's name? That doesn't seem right at all.
Another problem with no obvious solution given the combination of
export/ID issues with MT).
What's strange is that there's been a slight breakdown in
continuity now, because I did "post" some updates to that temporary
index file, but it couldn't be part of the regular blogflow. Hopefully
this entry fixes that to the extent possible.
Okay, lessons learned?
- Backups do work. :) I am going to do
another full backup today, and I'll try to set up something automated
to that effect. (Yes, I know I should have done it before, but as
usual there are no simple solutions, and then you leave it for the
next day... and the next...). Plus, backups for MT installations
should always be both of the DB and the published data, to make
recovery quick. (I have about 1500 entries, which amount to something
like 20MB of generated HTML--additionally, the images are posted
directly on the archives directory, so if you're not backing that up,
you've lost them).
- For MovableType, the export feature is not so great as far as
backups are concerned. The single-ID-per-database problem is a big one
IMO, and I don't think MT is alone in this. We need to start looking
at recovery and transition in a big way if weblogs are going to hit
the mainstream (and we want permalinks to be really permanent).
- Solutions are often simpler than you think, if you have the right
data. Having a full backup makes recovery in this case easy and fast.
- This stuff is still too hard. What would a less
technically-oriented user do in this situation? Granted, it was my
knowledge (since I was fixing stuff directly on the server) that
actually created the problem in the first place, but there are
lots of ways in which the same result could have been "achieved",
starting from simple admin screwups, hardware failures, etc.
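The "something automated" from the first lesson could start out as small as the sketch below. It assumes the database lives in one directory and the published files (including uploaded images) in another. The function name backup_site, the paths and the tarball naming are hypothetical, not a description of any existing setup.

```shell
#!/bin/sh
# Archive both the database directory and the published files into one
# dated tarball. All names and paths here are hypothetical placeholders.
backup_site() {
    db_dir=$1        # e.g. the directory holding the weblog database files
    archive_dir=$2   # published HTML, including uploaded images
    dest_dir=$3      # where tarballs are kept, ideally on another machine
    stamp=$(date +%Y%m%d)
    out="$dest_dir/site-backup-$stamp.tar.gz"
    # Print the tarball path only if tar succeeded.
    tar -czf "$out" "$db_dir" "$archive_dir" && echo "$out"
}
```

Run nightly from cron (and ideally while nothing is writing to the database), this would keep both halves of a restore, the DB and the generated pages, in a single file.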
Overall, this has been a wake-up call in more than one sense, and
it has set off a number of ideas and questions in my head. How to
solve these problems? I'll have to think about it more.
Anyway. Back to work now, one less thing on my mind.
Where was I?
Power Outage
12/14/2002 07:13 PM
It's raining and blowing like mad in the Bay Area today. I just had a
3.5 hour power outage. Yuck. Oh, well. It could be worse. At least it
doesn't snow here....
Host Outage
07/13/2004 03:22 PM
Our web host had emergency maintenance last night that lasted
nearly 12 hours. They took the site down and put up an older drive
which had dated news. We apologize for the confusion. Nothing like
messing up posting of the daily articles.
Planned outage
03/25/2005 09:07 PM
NewsGator Online will be down for approximately 8 hours starting
Saturday, March 26 at 9:00am MST. We will be implementing a major
system upgrade to enhance our service...
Technology Said to End Errors in Chips
Caused by Radiation
12/15/2003 02:25 AM
New York Times Dec 15 2003 1:55AM ET
Externally Linked CSS and JS Caused Drop
in Traffic
12/07/2002 08:31 AM
Although it makes site updating easier, external files for css and/or
js do increase download time and may have a negative effect on search
engine rankings.
On representing the backlog caused by an
absence of cerebral RAM...
06/12/2004 04:32 AM
That period before a launch is always stressful. This time is no
exception. It's occupying my entire head almost 24/7 no matter whether
I try and leave work on time or whether I'm there for twelve or
fourteen hour days. It doesn't make any difference. It's just there in
my head and it probably will be until a couple of weeks after it's
finally launched. C'est la vie. It's the nature of the beast.
In real life, of course, people can sense when you're busy and
don't feel particularly upset if you aren't able to give them the time
that you would like to. They might not be thrilled about it of course,
but they understand. But the signals that I can give off in public
through my weblog are less clear. Has he just abandoned the thing? No.
Why doesn't he have anything interesting to say anymore? Well, I do!
Probably more than ever at the moment. I just can't find the headspace
to work with to write them down. Why isn't he commenting on that thing
that's so obviously one of his core interests? Well, it's because I'm
not commenting on anything - the only creative thing I'm able to do
outside work at the moment is doodle in Illustrator.
What I need is some way of actually ambiently reflecting my
personal weather - without all that clunkiness of actively choosing
states of mind. What I actually need is some way of representing that
I'm just really really behind... A first suggestion - some way of
representing the number of unread posts I have in NetNewsWire at any
given moment (currently way over six hundred). Except that my path of
posting tends to be more circuitous than that. NetNewsWire posts get
opened in browser tabs if they look interesting, read thoroughly and
then (if they're not something I want to follow-up upon) they get
immediately closed. The number of open tabs reflects pretty much
exactly the number of things I actively want to talk about at any
given moment. If there are lots open, it probably means that I have a
lot I want to write about and no time to do it in. Except that doesn't
work either, because in addition to the six hundred things in
NetNewsWire I haven't filtered and the fifty tabs I have open at the
moment, I also have four folders in my bookmarks called "State of Play
1-4" that were the sum total of all the things I wanted to talk about
and had open in Safari but then had to store quickly so that I could
install a Mac OS X update. That's another two hundred discussions I
really want to get involved in - that I want to contribute to. And
then there's the four or five little projects I have on the side that
I've been trying to write up but have been incapable of doing so.
So six hundred unfiltered posts, fifty open tabs representing fifty
filtered posts to talk about, two hundred bookmarks representing two
hundred even more filtered conversations to get into, plus four or
five multi-page documents (one around 6,000 words) that have been
growing in the sidelines that I'm unable to push out into the world in
any effective way. That is the index of how busy and behind I feel.
That is the measure of my total absence of cerebral RAM. Do you now
understand why I'm not posting that much?
Weird context shifts caused by IM on
hiptops...
12/22/2004 01:40 AM
I'm having a crisis of etiquette caused by what I believe to be bad
user interface design. Basically it works like this. I look at my
iChat buddy list (to the right) and I see a big list of people who are
'green' (indicating availability), 'orange' (indicating absence or
idle-ness) or 'red' (indicating explicitly 'away', but still
contactable if necessary).
Now my expectation of people on my iChat list is that if they are
green they are currently using their computer at this precise moment.
They're actually looking at the screen. Which means that a ping to
them should be incredibly unobtrusive but noticeable and should
involve the absolute least number of keystrokes / interactions to be
able to tell someone you're busy and/or start a conversation with
them. Actually, iChat doesn't really handle that totally brilliantly
in a range of ways, but the aspiration should remain. The ping should
be non-invasive but immediately cognitively recognisable, and a
response should be as simple as possible. It is with the understanding
that the recipient's experience will be something like this that we
are able to ping our friends or colleagues without feeling like we're
being necessarily rude.
Except that this presumptive understanding of the experience of the
person at the other end of the connection is starting to deteriorate.
At least three or four of the people I have on my IM list are now
accessing their IM via their hiptops. This changes the experience
immediately - firstly because the recipient is now not necessarily
engaged in a looking-at-a-screen-like activity. They could be walking
in a fish market. They could be chatting to their mother on a phone.
They could be driving a car. Secondly in order for them to react to
the messages they're receiving they have to physically move the device
to a place where they can focus upon it. The casual ping is
immediately an intrusive one. And then - of course - they have to find
a way to respond to the ping - either by using slow phone-style or
fold-out keyboards, or by changing their presence. None of these
actions are simple or quick enough to make the experience of using a
hip-top and responding to messages on a hip-top comparable with
responding via a computer keyboard.
All of which would be fine if it wasn't potentially difficult to
distinguish between a person being rudely invasive and a device that
encourages potentially invasive attempts at social intercourse... And
if it wasn't - in turn - difficult for the person sending a message to
distinguish between a long silence that resembles some kind of
'shunning' activity and a long silence that is merely a consequence of
circumstances or the difficulties in getting to your messaging. On
both sides there are social problems that emerge because the behaviour
of the interfaces is confused with the behaviour of the people at
either end - the software/interface actually makes the person at
the other end seem rude - and purely because there is a disparity
between the social engagement one thinks one is engaging in and
the consequence it might have.
The software attempts to compensate for this a little bit. Most of
my friends that are using hip-tops use some kind of status message to
convey that they are mobile - which would work more effectively if you
couldn't easily hide the status message to free up screen real-estate.
In the meantime, the signifiers that actually tell you that someone is
online completely overpower the signals that indicate their
mobility.
So what's the solution? Well ideally - since you're looking at
another form of engagement you'd distinguish it from the more
conventional uses for IM. A separate scrollable container at the
bottom of the screen or another buddy-list (a la the Rendezvous
window) would compensate for some of these impediments - although
probably at the cost of adding in more complexity. Probably the
simplest solution would just be to revisit the particular presence
indicators. In iChat then there might be two options: firstly an
improvement of the portable devices to accurately reflect 'available'
and 'idle', and secondly the creation of a new form of presence to go
alongside 'available', 'idle' and 'busy'. Either would be a useful
corrective feature which could alleviate the social clumsiness of
mobile IM.
Do other people have experiences like these? And if so, how do you
resolve them? Do you leave it to social convention to work through
problems like these, or is a simple UI or technological solution more
simple? Any and all thoughts gratefully received...
Brain parasite caused sea otter deaths
05/20/2004 11:34 PM
AP via New Jersey Online May 21 2004 4:06AM GMT
Google says MyDoom virus caused problems
07/27/2004 09:29 AM
Hole in blackbox voting caused by
smoking gun
09/01/2004 10:15 AM
David Isenberg circulated by email this morning a snip from Bev Harris
at Black Box Voting: The Diebold GEMS central tabulator contains a
stunning security hole Manipulation technique found in the Diebold
central tabulator — 1,000 of these systems are in place, and
they count up to two million votes at a time. By entering a 2-digit
code in a hidden location, a second set of votes is created. This set
of votes can be changed, so that it no longer matches the correct
votes. The voting system will then read the totals from the bogus vote
set. It takes...
Witness: Cancer not caused by chemicals
(SiliconValley.com)
01/27/2004 07:16 AM
SiliconValley.com - A respected cancer researcher Monday told jurors
in the IBM toxics trial that he believes exposure to clean-room
chemicals did not cause breast cancer in a former employee suing the
computer giant.
Web outage blamed on zombies
06/17/2004 05:12 AM
ZDNet UK Jun 17 2004 9:03AM GMT
Google plays down outage
01/06/2005 07:24 AM
News.com.au - Thu Jan 6, 07:08 am GMT
We got heavily affected by this outage
05/05/2004 04:12 AM
On a Wing and a Wiki. When burglars brought down the
Internet link to Ziff-Davis' Manhattan offices, open-source
software (and Sean Gallagher's personal Web server) kept
eWEEK.com's stories flowing. [eWEEK.com
Messaging and Collaboration]
Woe - this outage affected us!
We're trying to get this system done
for E3 next week and all of a sudden all of the net connections to NYC
are down. Everyone's email is out. Total outage on
infrastructure, servers, data traffic, testing, updates: it's all
off-line.
Not a very conducive thing to have happen less than a week from
launch.
:-)
Impact of Outage Minimal
06/17/2004 04:38 PM
“Akamai Technologies (akamai.com) said yesterday that the
“sophisticated, large-scale” denial of service attack it
suffered earlier this week that impacted its naming functionality had
only a minimal impact on its customers.”
Temporary site outage
07/23/2004 02:43 PM
Linux.com is being re-launched. For several hours this afternoon,
neither Linux.com, IT Manager's Journal.com nor NewsForge.com will be
visible. We regret the inconvenience, but feel the new Linux.com will
be well worth it!
Comcast's Offer for Outage: $1.43 a Day
04/15/2005 12:36 PM
After experiencing three nights of network outages in less than a
week, BetaNews has learned that in at least one case in southeast
Michigan, a customer received a credit of $2.86 on their bill to
compensate for the two days of service he complained about.
Net outage strikes Comcast
04/08/2005 12:57 AM
Blog: Comcast, the largest provider of broadband Internet access with
6.5 million customers, suffered a general outage Thursday evening.
...
'Zombie' PCs caused Web outage, Akamai says