tsunami

googlebot and tsunami http 500

Luke Breuer
2008-03-27 04:47 UTC

the problem
For a while, tsunami was responding to googlebot with HTTP 500: googlebot forces .NET into cookieless mode, which breaks the URL rewriting I do with tsunami (an analysis). I fixed that error at the end of January 2008, but googlebot has yet to re-index some pages.
pages that have not yet been re-visited
uri_stem                                                                         count       min_visited             max_visited             count_500   count_non_500
-------------------------------------------------------------------------------- ----------- ----------------------- ----------------------- ----------- -------------
/time/item/ASP.NET_Authentication_and_Authorization/51.aspx                      10          2008-01-10 14:47:59.000 2008-01-27 04:34:59.000 10          0
/time/item/Extentions_to_tag_cloud_systems/239.aspx                              10          2008-01-12 19:06:09.000 2008-01-25 07:47:24.000 10          0
/time/item/C_3.0_ZeroWidthSplit/221.aspx                                         11          2008-01-07 04:14:52.000 2008-01-16 21:33:08.000 11          0
/time/item/IPrincipal_~_ASP.NET/110.aspx                                         7           2008-01-09 23:22:16.000 2008-01-16 15:27:17.000 7           0
/time/item/C_20_functional_code/242.aspx                                         3           2008-01-15 12:22:28.000 2008-01-16 03:25:35.000 3           0
/time/item/ASP.NET_Authentication_and_Authorization_~_ASP.NET/107.aspx           6           2008-01-09 23:44:08.000 2008-01-15 06:36:10.000 6           0
/time/item/javascript_window.open/19.aspx                                        13          2008-01-03 14:37:52.000 2008-01-15 06:34:39.000 13          0
/time/item/ASP.NET_Gotchas_~_ASP.NET_Gotcha/75.aspx                              7           2008-01-09 23:21:03.000 2008-01-15 06:17:49.000 7           0
/time/item/Thoughts_on_Static_vs._Dynamic_Typing/179.aspx                        8           2008-01-08 15:59:16.000 2008-01-13 10:48:29.000 8           0
/time/item/javascript_window.onbeforeunload/21.aspx                              13          2008-01-03 14:24:52.000 2008-01-13 00:04:51.000 13          0
/time/item/ASP.NET_Gotchas/8.aspx                                                13          2008-01-03 07:09:27.000 2008-01-12 23:53:14.000 13          0
/time/item/Performance_Counters_in_ASP.NET_~_ASP.NET_Gotcha/73.aspx              3           2008-01-09 23:42:18.000 2008-01-11 02:01:01.000 3           0
/time/item/ASP.NET_Gotcha_~_Programming_Gotcha/72.aspx                           3           2008-01-09 23:33:41.000 2008-01-11 01:44:02.000 3           0
/time/item/TIME_Item_Names_~_ASP.NET_Questions/106.aspx                          3           2008-01-09 23:35:35.000 2008-01-11 01:31:06.000 3           0
/time/item/ASP.NET_Authentication_and_Authorization_~_Authorization/113.aspx     3           2008-01-09 23:46:59.000 2008-01-11 01:30:56.000 3           0
/time/item/ASP.NET_Questions_~_ASP.NET/101.aspx                                  3           2008-01-09 23:27:01.000 2008-01-10 18:40:44.000 3           0
/time/item/C_TrimLiteral/211.aspx                                                5           2008-01-06 01:25:41.000 2008-01-10 05:42:09.000 5           0
/time/item/Performance_Counters_in_ASP.NET/30.aspx                               5           2008-01-04 16:36:06.000 2008-01-09 03:42:24.000 5           0
/time/item/C_3.0_Syntax_reduction/212.aspx                                       3           2008-01-05 22:12:53.000 2008-01-08 08:50:42.000 3           0
/time/item/C_3.0_SelfJoinByOffset/220.aspx                                       3           2008-01-07 06:26:17.000 2008-01-08 07:22:53.000 3           0
/time/item/SQL2005_Reset_default_schema_for_users/195.aspx                       11          2008-01-02 00:40:29.000 2008-01-08 03:35:44.000 11          0
/time/item/javascript_window.close/17.aspx                                       4           2008-01-05 12:09:38.000 2008-01-08 01:08:48.000 4           0
/time/item/SQL2005_Script_indexes/192.aspx                                       6           2008-01-03 14:51:14.000 2008-01-06 16:26:19.000 6           0
/time/item/SQL2000_effective_dating/214.aspx                                     3           2008-01-04 10:57:00.000 2008-01-05 07:05:09.000 3           0
/time/item/SQL2005_Duplicates_analysis/215.aspx                                  2           2008-01-04 11:02:44.000 2008-01-05 01:29:05.000 2           0
/time/item/ADO.NET_Gotchas/158.aspx                                              5           2008-01-02 18:18:00.000 2008-01-05 01:10:39.000 5           0
SQL
-- googlebot requests under /time/ (excluding edit pages), grouped by page;
-- keep only pages where every logged response was an HTTP 500
select  uri_stem      = cast(uri_stem as varchar(80)),
        count         = count(*),
        min_visited   = min(date_time),
        max_visited   = max(date_time),
        count_500     = sum(case when status_code = 500 then 1 else 0 end),
        count_non_500 = count(*) - sum(case when status_code = 500 then 1 else 0 end)
from    zs_IIS_Log
where   uri_stem like '/time/%'
    and user_agent like '%googlebot%'
    and uri_stem not like '%/edit.aspx'
group by uri_stem
-- count_non_500 = 0: googlebot has never received anything but a 500 for this page
having  count(*) - sum(case when status_code = 500 then 1 else 0 end) = 0
-- most recently crawled pages first
order by max(date_time) desc
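
A companion query may help track progress: it lists the pages that did return 500 to googlebot at some point but have since been fetched successfully. This is only a sketch against the same zs_IIS_Log columns used above. The cutoff in @fixed_on is an assumption (the fix is only dated to the end of January 2008), and the column names last_500 and first_ok_after_fix are made up for illustration.

```sql
-- Assumed cutoff: the fix is only known to have gone in at the end of January 2008.
declare @fixed_on datetime
set @fixed_on = '20080201'

select  uri_stem           = cast(uri_stem as varchar(80)),
        -- last time googlebot received a 500 for this page
        last_500           = max(case when status_code = 500 then date_time end),
        -- first non-500 response on or after the assumed fix date
        first_ok_after_fix = min(case when status_code <> 500
                                       and date_time >= @fixed_on then date_time end)
from    zs_IIS_Log
where   uri_stem like '/time/%'
    and user_agent like '%googlebot%'
    and uri_stem not like '%/edit.aspx'
group by uri_stem
-- at least one 500, and at least one non-500 response since the cutoff
having  sum(case when status_code = 500 then 1 else 0 end) > 0
    and sum(case when status_code <> 500
                  and date_time >= @fixed_on then 1 else 0 end) > 0
order by first_ok_after_fix
```

Pages that appear in neither result set were never served a 500 in the logged period, so the two queries together should cover every page affected by the bug.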