author Linus Torvalds <torvalds@ppc970.osdl.org> 2005-04-16 15:20:36 -0700
committer Linus Torvalds <torvalds@ppc970.osdl.org> 2005-04-16 15:20:36 -0700
commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
tree 0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation/BUG-HUNTING
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
Diffstat (limited to 'Documentation/BUG-HUNTING')
1 files changed, 92 insertions, 0 deletions
diff --git a/Documentation/BUG-HUNTING b/Documentation/BUG-HUNTING
new file mode 100644
index 000000000000..ca29242dbc38
--- /dev/null
+++ b/Documentation/BUG-HUNTING
@@ -0,0 +1,92 @@
+[Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)]
+This is how to track down a bug if you know nothing about kernel hacking.
+It's a brute force approach but it works pretty well.
+You need:
+ . A reproducible bug - it has to happen predictably (sorry)
+ . All the kernel tar files from a revision that worked to the
+ revision that doesn't
+You will then do:
+ . Rebuild a revision that you believe works, install, and verify that.
+ . Do a binary search over the kernels to figure out which one
+ introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but
+ you know that 1.3.69 does. Pick a kernel in the middle and build
+ that, like 1.3.50. Build & test; if it works, pick the mid point
+ between .50 and .69, else the mid point between .28 and .50.
+ . You'll narrow it down to the kernel that introduced the bug. You
+ can probably do better than this but it gets tricky.
+ . Narrow it down to a subdirectory
+ - Copy the kernel that works into "test". Let's say that 3.62 works,
+ but 3.63 doesn't. So you diff -r those two kernels and come
+ up with a list of directories that changed. For each of those
+ directories:
+ Copy the non-working directory next to the working directory
+ as "dir.63".
+ One directory at a time, try moving the working directory to
+ "dir.62" and the non-working "dir.63" into its place, like so:
+ mv dir dir.62
+ mv dir.63 dir
+ find dir -name '*.[oa]' -print | xargs rm -f
+ And then rebuild and retest. Assuming that all related
+ changes were contained in the sub directory, this should
+ isolate the change to a directory.
+ Problems: changes in header files may have occurred; I've
+ found in my case that they were self explanatory - you may
+ or may not want to give up when that happens.
+ . Narrow it down to a file
+ - You can apply the same technique to each file in the directory,
+ hoping that the changes in that file are self contained.
+ . Narrow it down to a routine
+ - You can take the old file and the new file and manually create
+ a merged file that has
+ #ifdef VER62
+ routine()
+ {
+ ...
+ }
+ #else
+ routine()
+ {
+ ...
+ }
+ #endif
+ And then walk through that file, one routine at a time and
+ prefix it with
+ #define VER62
+ /* both routines here */
+ #undef VER62
+ Then recompile, retest, move the ifdefs until you find the one
+ that makes the difference.
+Finally, you take all the info that you have, kernel revisions, bug
+description, the extent to which you have narrowed it down, and pass
+that off to whomever you believe is the maintainer of that section.
+A post to linux.dev.kernel isn't such a bad idea if you've done some
+work to narrow it down.
+If you get it down to a routine, you'll probably get a fix in 24 hours.
+My apologies to Linus and the other kernel hackers for describing this
+brute force approach; it's hardly what a kernel hacker would do. However,
+it does work and it lets non-hackers help fix bugs. And it is cool
+because Linux snapshots will let you do this - something that you can't
+do with vendor supplied releases.
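
The binary search over kernel revisions described in the HOWTO can be sketched in a few lines of Python. This is only an illustration, not part of the original document: the function name first_bad and the works() callback are hypothetical, and works() stands in for the manual "build, install, boot, and try to reproduce" step. The sketch assumes the first revision in the list is known to work, the last is known to fail, and the bug stays present in every revision after the one that introduced it.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], works: Callable[[str], bool]) -> str:
    """Return the earliest entry of `versions` for which works() is False.

    Assumes versions[0] works, versions[-1] does not, and that the bug
    never disappears again once it has been introduced.
    """
    if len(versions) < 2:
        raise ValueError("need at least one good and one bad revision")
    good, bad = 0, len(versions) - 1  # indices of known-good / known-bad
    while bad - good > 1:
        mid = (good + bad) // 2
        if works(versions[mid]):
            good = mid  # bug was introduced after mid
        else:
            bad = mid   # bug was introduced at or before mid
    return versions[bad]


# Hypothetical stand-in for the real build-and-test cycle: here we simply
# pretend that the bug first appeared in 1.3.63.
def works(version: str) -> bool:
    return int(version.split(".")[2]) < 63


versions = [f"1.3.{n}" for n in range(28, 70)]  # 1.3.28 .. 1.3.69
culprit = first_bad(versions, works)
print(culprit)  # 1.3.63
```

With 42 revisions between 1.3.28 and 1.3.69, the loop needs at most six build-and-test rounds, which is the point of bisecting rather than testing every kernel in order.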