Ladebug FAQ

To send mail to the maintainer: ladebug support

Check out these debugging tips:

Using the debugBreak function to activate the debugger
Using the history tool to view the execution history of a program


What is Ladebug?

Ladebug is a debugger, very much like dbx or gdb; it has a command-line mode and a GUI.

The command language is similar to dbx's.

Ladebug runs on HP Tru64 UNIX and Linux for HP Alpha systems. It is available for free: kits can be downloaded over the Net.

The Linux version of Ladebug is still in development, and not all of Ladebug's capabilities are available there; see the README file in the Linux kits for the current list of unsupported capabilities.

What versions of Ladebug are there?

Ladebug has separate versions for each major release of the operating system for best performance, but is generally upward-compatible. It is not generally downward-compatible.

On Tru64 UNIX, there are versions for v4.0 and v5.0. The v4.0 version does not support kernel debugging on v5.0, but otherwise will run fine.

On Linux, Ladebug versions are generally upward-compatible.

How do I get a kit?

Kits for Tru64 UNIX

The latest version is available on the internet. There are v4.0 and v5.0 versions of this kit, both containing the GUI. The v4.0 version of the kit will install and run on v5.0 systems, but the v5.0 kit provides better kernel debugging support for v5.0 (and higher) systems.

Ladebug now has a built-in GUI, replacing the previous dxladebug GUI.

The Ladebug provided with the base OS (i.e. the one which comes with the operating system) doesn't include the GUI in some OS versions due to space limitations on the CDROM. To get a Ladebug with the GUI in that case, download a Ladebug of the appropriate type.

Customers can get a kit over the Internet from either:

The Developers' Toolkit Supplement for Tru64 UNIX website, where the latest kit delivered to the Toolkit can be found, or
The Ladebug debugger web site, where the latest kit can be found.

We (the Ladebug team) prefer you to use the latter - a lot of testing goes in before we declare it to be "field-test-quality"!

Kits for Alpha Linux

See the Fortran for Alpha Linux or the Ladebug debugger web site.

Is there a hard-copy manual?

There is no hard-copy manual. Ladebug's manual exists only as HTML source, to support searching and rapid updating.

If you need hard-copy, most browsers will let you print the current page; for Ladebug, the current page is the whole manual. The manual's formatting has been made flexible as part of being a web document, so it doesn't depend on page breaks or page widths.

Having the whole manual be one page has drawbacks--it can take a long time to load. Breaking it up has drawbacks, too--you can't search it as easily, and it complicates internal links. Please tell us whether you'd prefer us to break the manual into a set of HTML chapters, or like it this way!

What other development tools do you recommend?

Visual Threads - a unique and fully visual tool to analyze and debug multithreaded programs.

Developers' Toolkit for Tru64 UNIX - a prerequisite for programming under Tru64 UNIX, the kit includes the Compaq C compiler, and application debugging, profiling, analysis, reordering, and porting tools.

ATOM - a toolkit from the Developers' Toolkit above, but worth listing on its own! It lets you modify an application so that it will perform run-time analysis on itself. We use ATOM to build the history tool, for example.

What can I do to prepare my application for debugging?

There are things you can do to your program to make it easier to debug that go beyond using -g on the compile line:

You can write data-structure dumping routines which you can call from the debugger command line. For example, if you have a linked list, your dumper could print out the data in a more compact and readable form than Ladebug could.
If your program uses free and malloc, you can wrap the calls to those routines inside your own memory getting and freeing routines. This will let you add checking or memory-optimization code later, if you want. But for debugging it will also let you turn off free-ing completely.
If you have a bug, and when you comment out the free in your wrapper routine it goes away, you've just found out that your bug is due to a memory-usage error like using a pointer to deleted storage or reading uninitialized data. Wrapper routines will also give you a point to set a breakpoint on to set a watch on a particular chunk of memory.
If you know that the bad access is a write to a particular memory address, for example, you can set a temporary breakpoint in the malloc wrapper to stop when the block containing that address is allocated, and then set a watchpoint on that memory.
You can add in checking code (such as C's assert package), which can verify assumptions in your code. For example, you can add in checks that pointers are not null, that lists are not circular, and that things which should be sorted are sorted.
- Use of #ifdef would allow this code to be included only in the development version of your application.
- You can also control the checking with an environment variable or a global variable (or several variables!).
- If all the checking code calls a single failure-reporting routine, you can set a breakpoint on that routine when you run under the debugger.
You can maintain a phase or state variable which your code keeps up to date. Then you could set a watchpoint on that variable and stop the application when it enters a certain phase or state. Similarly, adding counts to lists and depth indicators to trees will give you more ways to set breakpoints closer to your bug.
You can use a memory-management tool (such as Third Degree), which checks your use of heap memory. Many bugs are related to references through dangling pointers or to uninitialized variables.
You can add logging to your code, printing out significant steps and values as the program runs.
- Use of #ifdef will allow you to keep the logging overhead out of the production version
- You can also control the logging with an environment variable or a global variable (or several variables!).
- When you have a bug, you can turn on the appropriate logging flags which may help you understand how to get closer to the bug before you start debugging.

Ladebug itself uses logging controlled by environment variables, self-dumping data-structures and thousands of ASSERTs. We have found these very helpful in finding and fixing bugs.

All I want to do is add a `printf`!

Ladebug can be used to add printfs to your code without the effort of recompilation and relinking.

Here's an example, showing how to add a run-time printf at line 10 of the file filename.C in the program myprog; user commands to Ladebug follow the (ladebug) prompt:

    $ ladebug myprog
    Welcome to the Ladebug Debugger Version ...
    object file name: /usr/users/aperson/myprog
    Reading symbolic information ...done
    (ladebug) file filename.C
    (ladebug) when at 10 { printf "Value of x is 0x%x, of i is %d\n", x, i }
    (ladebug) run

        ... program runs normally ...

    Value of x is 0x1234, of i is 23

        ... and you get the printf output!

The major differences are that the Ladebug command printf doesn't need parentheses or a trailing semicolon, and you can add and modify the Ladebug printf without recompiling your application.

Since the debugger is involved, the Ladebug printf is slightly slower than recompiled code would be, but it is more flexible.

How can I get my program to start Ladebug?

debugBreak is a function that can be added to any program.

When the debugBreak function is called, it creates a Ladebug debugger process initialized so that the debugger can connect to the program, using the attach function of Ladebug.

The debugger winds up ready for input from the user, as though there had been a breakpoint at the end of the call to the debugBreak function.

debugBreak is described in more detail in its own note.

What was my program doing just before it got lost?
or
How can I see a history of my program's execution?

The history tool can be used with almost any program. It modifies your original program so that the program will record a history of the source lines it executes and the calls it makes.

The history tool is currently an unsupported prototype, and may change. It is available in source form; the buffer size and other factors can be changed by editing the source. It depends on ATOM, and will not work if ATOM is not available on your platform.

The history tool is described in more detail in its own note.

I hate having an empty line mean repeat the previous command

In interactive sessions, Ladebug repeats the previous command when you just press the return key again. For many users, this is a convenience feature, and they like the ability to step through a long sequence with one key-press per step.

But some don't, and there's a way to make interactive sessions work like Ladebug command scripts, and have blank commands be no-ops: set the $repeatmode variable at the command line or in a .dbxinit script:

    (ladebug) p 12
    12
    (ladebug) 
    12
    (ladebug) set $repeatmode = 0
    (ladebug) p 13
    13
    (ladebug) 
    (ladebug)

I hate the garbage characters for line-editing!

If you capture Ladebug's output in a way that Ladebug doesn't notice (e.g. by invocation in an emacs buffer), you can see that Ladebug issues lots of escape characters to support line editing.

Unfortunately, that's about all you can see, because the extra junk pretty much obscures the actual input or output. It might look like this (the original command typed in is print 5 + 2):

    ^[[Jp^[[Jr^[[Ji^[[Jn^[[Jt^[[J ^[[J5^[[J ^[[J+^[[J ^[[J2

If you're willing to live without line-editing support, you can get rid of this using a control variable, which can be set either at the command line or in a .dbxinit file:

    (ladebug) set $editline = 0

I hate the command language!

Some users prefer gdb's command language. We can't help there.

But with aliases you have a limited ability to tailor the command language to suit you better. For example, if you want ret to mean return and pp to mean print *, you can write a .dbxinit file which defines them that way:

    alias ret "return"
    alias pp  "print *"

I hate the way Ladebug is so verbose!

Some people like lots of messages and some don't. There are several controls to make Ladebug much less noisy. Set them as indicated if you like a terse debugger:

    set $doverbosehelp = 0
    set $giveladebughints = 0
    set $showlineonstartup = 0
    set $showwelcomemsg = 0
    set $stackargs = 0
    set $statusargs = 0

See the manual for details.

I hate the way the links in the manual sometimes don't work!

The manual is a large file, often taking several seconds to load. If you click on a link before the file is finished loading, the browser may not find the target of the link. To correct this, check the status bar at the bottom of the browser to make sure the file is done loading, then try the link again.

I hate the way the `edit` command stops the debugger!

Since Ladebug doesn't check the DISPLAY environment variable or the value of the EDITOR environment variable, it can't predict whether you'll be editing in the same window as the debugger or not. For example, if DISPLAY has no definition and EDITOR is emacs, the edit command will cause emacs to take over the current terminal window, while if DISPLAY is defined to a particular node, emacs will come up in a different window.

To prevent contention for the command-line window, Ladebug pauses while the editor is running, and you can only start issuing Ladebug commands again when you've exited the editor.

If you have set DISPLAY in such a way that the editing window will be a different window, and want to continue giving Ladebug commands while you continue editing, you can do this using the sh command and the & asynchronous command terminator. Here's an example showing how to do it for emacs:

    (ladebug) sh emacs /usr/users/me/foo.c &

To find out the path to the current file, print $curfilepath. Unfortunately, you'll have to cut and paste the result into the sh command line, as shell commands are passed to the shell uninterpreted (i.e. for "sh emacs $curfilepath &" Ladebug won't evaluate $curfilepath, but the shell will try to!).

Ladebug hangs when I debug my application!

If your application does something wrong, you might take the understandable shortcut of recalling the command and prepending ladebug to debug it.

    $ foobar < foobar.input > foobar.output
    *BANG*

    $ ladebug foobar < foobar.input > foobar.output

But on this command line, ladebug is the main application, and it will read from foobar.input and write to foobar.output, thus ignoring the input from your terminal.

Use the run command to add any parameters and IO redirection:

    $ ladebug foobar
         :
    (ladebug) run  < foobar.input > foobar.output

This will redirect your application's I/O, but not Ladebug's. To redirect Ladebug's I/O, see the manual section for record. Note: The manual takes a few seconds to load.

Ladebug hangs when I do a `run`!

The rerun command repeats the arguments and any file redirections from your first run command, as does r, which is an alias for rerun.

A run command will not repeat the arguments. This means that your application will probably attempt to read from stdin, which is probably the terminal you invoked Ladebug from.

If you are using the command-line interface, this may look like a failure to get a response from Ladebug, if your application does not hit a breakpoint before it reads from stdin. Because the run command does not produce any output, this can look like a hang. To get back to the Ladebug command line, enter Ctrl/C.

Why can't I see the commands I am typing in?

Because Ladebug offers support for editing command lines, it queries the terminal for the number of lines and columns. If the actual terminal size is different than the answers Ladebug gets, then Ladebug may not correctly display your input. The most common symptom is not being able to see the command as you type it in.

If this happens, check that you have correctly set the size information for your terminal at the shell-command level.

For example, if your terminal is 47x80:

    % stty rows 47 ; setenv LINES 47
    % stty cols 80 ; setenv COLS 80

Another way to ensure that your terminal is correctly sized is to use the resize command at the shell prompt. See the resize man page for details.

What does `opaque` mean?

The debugger sometimes prints out:

    Information: An <opaque> type was presented during execution of the
    previous command.

    For complete type information on this symbol, recompilation of this program
    will be necessary.

This happens when it can find the struct, union, or class identifier declaration in the debugging information, but can find no information about the identifier's definition.

This can happen for two reasons:

You compiled only some of your sources with -g, and those sources contained the struct only as an incomplete type (struct S;). Other sources containing the struct definition (struct S { ... }) were compiled without -g, or didn't even exist.
The C++ compiler's heuristics for reducing the repetition of struct information resulted in a similar effect. See the documentation of the cxx -gall in the cxx man page and release notes for more information.

The debugger outputs `Assertion failed` and a traceback. What do I do?

The debugger does lots of internal checks, and also checks on the executable files and shared libraries that it is loading.

If these checks show an internal inconsistency, the assertion gets triggered. Please report the problem to us so we can fix it.

In many cases, Ladebug can recover and you will be able to continue debugging. In severe cases, when it cannot recover, Ladebug will exit.

We want Ladebug to be usable and robust: if you find an error, please tell us!

How do I use Emacs with Ladebug?

You can control your debugger process entirely through the Emacs GUD (Grand Unified Debugger) buffer mode, which is a variant of shell mode. All the Ladebug commands are available and you can use the shell mode history commands to repeat them.

Ladebug version 4.0-48 and higher support GNU Emacs Version 19 and higher.
Ladebug version 4.0-58 and higher support Lucid XEmacs Version 19.14 and higher.

The information that follows assumes you are familiar with Emacs and are using the Emacs notation for naming keys and key sequences.

For each Emacs session, before you can invoke the debugger, you must load the Ladebug-specific Emacs LISP code, as follows:

M-x load-file

At the Load file: prompt, type:

/usr/lib/emacs/lisp/ladebug.el

You can also place a load-file call in your Emacs initialization file (~/.emacs). For example:

(load-file "/usr/lib/emacs/lisp/ladebug.el")

To start the debugger with Emacs, type:

M-x ladebug

If you use symbolic links to access your sources, and/or if you use NFS-mounted file systems, you probably want to adjust the Emacs variable directory-abbrev-alist in your Emacs initialization file (~/.emacs).

For example, if your home directory is /usr/users/fred, its name may show up as /tmp_mnt/var/users/fred because of NFS mounting, and Emacs may shorten this to /var/users/fred. This can cause confusion if you load the file ~/src/prog.c into Emacs (which gets expanded as /usr/users/fred/src/prog.c).

Later, the debugger will tell Emacs to load /var/users/fred/src/prog.c. You now have two different buffers (likely named "prog.c" and "prog.c<2>") trying to contain the same file but under different full names (/usr/... and /var/...). You will probably use the first, the debugging session the second. The solution is to put something like what follows in your Emacs initialization file:

(setq directory-abbrev-alist
      (cons (cons "^/var/users/" "/usr/users/")
            directory-abbrev-alist))

This will cause Emacs to translate filenames beginning with /var/users/ to be /usr/users/ instead, which will then find the file under the name you would expect. (The above assumes all usages of /var/users/ should be redirected to /usr/users/.) A little experimentation will reveal what translations you need for your environment. For more information on directory-abbrev-alist, do

    M-x describe-variable [Return] directory-abbrev-alist

from within Emacs.

For more details, see the manual. Note: The manual takes a few seconds to load.

How do I stop in the main Fortran program?

When a Unix application is started, the system routine __start calls a routine named main. Since Fortran programs may not have such a routine, the Fortran RTL includes a file (for_main.c) which has a main which will call your main program unit.

This explains why you stop in for_main.c if you've followed the C habit of using "stop in main" as your first Ladebug command. If you do stop there, you can do a few step commands, and follow the call to Fortran code.

The simplest way to get to your 'main' program is for you to just set a breakpoint on it by name:

    (ladebug) list 1:2
          1         program for_test
          2         logical*1 b
    (ladebug) stop in for_test
    [#1: stop in subroutine for_test() ]
    (ladebug) r
    [1] stopped at [subroutine for_test():7 0x120001554]
          7 10      b = 1

How do I examine Fortran arrays?

Arrays of strings are incorrectly understood as though the array had an extra left-most dimension indexing the individual characters, rather as though it were an array-of-array-of-character.

For this reason, the debugger displays the following error message when you try to use correct Fortran syntax:

    (ladebug) print a(1)(:1)
    print a(1)(:1)
              ^
    Unable to parse input as legal command or Fortran expression.

Also for this reason, the debugger accepts expressions that Fortran would not, for example:

    (ladebug) print a(1:2,2:3)
    (2) "ef"
    (3) "ij"

You should use this second form to deal with arrays of strings.

How do I examine items in Fortran Modules?

If you are using Ladebug V66 or earlier, there is no support for Fortran Modules: you can't set your scope to a module, or print things by their name within a module.

But because the items do exist, they have "mangled" names in the linker table, and you can print them using the mangled name.

An item named an_item in a module named a_module will have the mangled name $a_module$an_item

    (ladebug) p $a_module$an_item  
    42

If you are using Ladebug V67 or V68 and your object file is compiled by a Fortran compiler that emits module debug information, then you can use the rescoping syntax to refer to a module component. For example:

    (ladebug) p `a_module`an_item  
    42

If you are using Ladebug V69 or later and Fortran version 5.5A or later, module components which are visible to the code are visible to Ladebug and can be referenced without the rescoping:

    (ladebug) p an_item  
    42

Components in a module not made visible by a USE statement will still require the rescoping syntax to specify the module they are in, of course.

The modules themselves can be printed, as well:

    (ladebug) print MIM_A
    module MIM_A
      use MIM_B, only:
        BY = 3
        BX = 2
      AX = 1
    end module MIM_A

How do I set a temporary or counted breakpoint?

Ladebug lets you construct temporary or counted breakpoints using debugger variables and if clauses:

Temporary breakpoint
Use $curevent and have the breakpoint's action list delete itself:
```
    (ladebug) stop in f { delete $curevent }
```

Counted breakpoint

Use a debugger variable to count breakpoint hits you want to skip, and a helper breakpoint to increment the count:

    (ladebug) set $bpt_count = 0;
    (ladebug) when at main { set $bpt_count = 0 }
    (ladebug) when at 12 { set $bpt_count = $bpt_count + 1 }
    (ladebug) stop at 12 if $bpt_count > 10

If your current language is C or C++, you can use $bpt_count++.

'Until' breakpoint

This is much like the counted breakpoint, with a different condition:

    (ladebug) set $bpt_count = 0;
    (ladebug) stop at 12 if $bpt_count < 10 { set $bpt_count = $bpt_count + 1 }

Combinations
One of the strengths of the Ladebug approach is that you can construct complex breakpoints using these mechanisms. Here's a breakpoint which triggers on the tenth and eleventh times and then disables itself:
```
    (ladebug) set $bpt_count = 0
    (ladebug) when at 59 { set $bpt_count = $bpt_count + 1 }
```
Note the number of that breakpoint (call it 1).

    (ladebug) stop at 59 if $bpt_count >= 10
    (ladebug) when at 59 if $bpt_count == 11 { disable 1, 2, 3 }

The breakpoints are disabled rather than deleted so that the same setup can be used on a rerun.

How do I set a conditional breakpoint using a variable in the breakpoint's define-time scope?

Breakpoint conditionals in Ladebug are evaluated in the scope in which the breakpoint triggers, not in the scope in which you define the breakpoint.

If you have variables with the same names in different scopes, and you wish to set a breakpoint that triggers based on the value of a variable in the current scope, use a debugger variable.

By setting the debugger variable to the address(&) of the test variable, you can then set your conditional breakpoint based on the value of the dereferenced debugger variable.

For example, consider the following program and debugging session:

(ladebug) list 1
      1 #include <stdio.h>
      2 
      3 int x = 0;
      4 int sum = 0;
      5 
      6 void bar(int x_from_main)
      7 {
      8     int x = 5;                    /* Inner 'x' */
      9     sum += x_from_main%x;
     10     printf("x in bar = %d\n",x);
     11 }
     12 
     13 void main()
     14 {
     15     for (x = 1; x < 10; x++)      /* Outer 'x' */
     16     {
     17         printf("x in main = %d, ",x);
     18         bar(x);
     19     }
     20     printf("final sum = %d\n",sum);
     21 }
(ladebug) stop at 18
[#1: stop at "main.C":18 ]
(ladebug) run
[1] stopped at [void main(void):18 0x1200018c8]	
     18         bar(x);
(ladebug) print x
1

If you created a breakpoint conditional on x, that would test the inner x:

    stop in bar if x > 5

To stop in bar only when the outer x becomes greater than 5, not the x in bar, create a debugger variable $mainX and use it to point to the x in main:

(ladebug) print $mainX
Symbol "$mainX" is not defined.
(ladebug) set $mainX = &x
(ladebug) print *$mainX
1
(ladebug) stop at 9 if *$mainX > 5
[#3: stop at "main.C":9 if *$mainX > 5 ]
(ladebug) disable 1
(ladebug) cont
x in main = 1, x in bar = 5
x in main = 2, x in bar = 5
x in main = 3, x in bar = 5
x in main = 4, x in bar = 5
x in main = 5, x in bar = 5
[3] stopped at [void bar(int):9 0x12000182c]	
      9     sum += x_from_main%x;
(ladebug) up
>1  0x1200018d0 in main() "main.C":18
     18         bar(x);
(ladebug) print x
6

Note that the define-time variable must have a lifetime which includes the evaluation of the conditional. If the variable has been deleted or is a local variable in a routine no longer on the stack, then the value that is fetched through the pointer in the conditional is undefined, and will probably be wrong.

How can I get to 'just before' where my application causes a SEGV?

If your application gets a SIGSEGV or other signal in a routine which is called many times, it might help to know which call is the one to cause the problem.

There is an ATOM-based tool "counter" which will instrument your program and count calls. When a signal is raised, it can calculate the correct Ladebug commands to use to get you to "just before" the event which causes the signal.

You might also want to consider instrumenting the program with this tool for ordinary testing or daily use. The run-time overhead is relatively low compared to the ring tool (there's a slowdown of about 50%), and it can provide you with information which could otherwise take some time to figure out.

Users could combine this tool with debugBreak (or the stack-dumping code at this location in the Ladebug manual). Note: The manual takes a few seconds to load.

Such a combination could cause the debugger to come up on signals, or to dump a stack when a signal occurred.

This tool is currently an unsupported prototype, and may change. It is available in source form, so that users may customize it by editing the source.

"Counter" is described in more detail in its own note.

How can I debug a stripped binary?

When you strip a binary with strip or ostrip, any debugging information is removed. Ladebug can't show you names, variables, or sources.

If your stripped code crashes and your customer sends you the traceback or core file, you'll only have hex addresses and machine instructions to work with.

The following are possible workarounds:

You can keep an unstripped version of your application around, and debug that.
This requires having both versions of the application at the customer site, or having a reproducer of the problem at your site. If at the customer site, Ladebug still won't be able to show you your sources, unless the remote site also has your source files.
You can keep an unstripped version of your target around, and load it into the debugger locally while you debug the customer's problem remotely in another window.
Now you can use the debuggable version to translate routine names to addresses and vice versa, using commands like p &main, and setting breakpoints at the hex addresses using stopi at.
A big advantage of this approach is that you can see sources on your side. A disadvantage is that the remote side doesn't have any type information, so you have to dead-reckon through data structures and it's easy to make mistakes.

You can use ostrip and produce an stb file, which contains the debugging information, and ship it with your application to the customer site. There you can use that file to re-create an unstripped executable and debug with it.

There's no real difference between this approach and shipping a debuggable version, but it does allow the customer to store the stb file off to the side and run the smaller stripped file most of the time.

One complication here is that there was a bug in ostrip in version V4.0 of Tru64 UNIX. But there is a workaround!

On V5.0 and higher, use this sequence:

    $ ostrip -t your-executable  # Produces .stb file
    
        # Ship your-executable and your-executable.stb
        # to the customer.  When you wish to debug remotely,
        # get the stb file, put it in the same directory
        # as the executable and re-join with this command:

    $ ostrip -j your-executable  # Join them

        # your-executable now has debugging information

On V4.0 systems, use this sequence:

    $ ostrip -c your-executable  # Adjusts for work-around
    $ ostrip -t your-executable  # Produces .stb file
    
        # Ship your-executable and your-executable.stb
        # to the customer.  When you wish to debug remotely,
        # get the stb file, put it in the same directory
        # as the executable and re-join with this command:

    $ ostrip -j your-executable  # Join them

        # your-executable now has debugging information

How do I relate a raw PC to my application code?

If you have obtained a PC (program counter) value and you want to find out where that PC occurs in your source code, use the Memory Display Commands to view that PC address as an assembly instruction. Note that Ladebug maintains the current value of the PC in $curpc.

(ladebug) print $curpc
0x120000b98
(ladebug) 0x120000b98/i
int main(void): test.c
*[line 12, 0x120000b98]	bis	r31, 7, r1

The first command printed the contents of $curpc, to obtain a PC value. The second command is a memory display command, and the "i" means to format the contents of memory at the given address into an assembly instruction.

The first line of output tells you the function name int main(void) and source file test.c that the PC occurs in. The second line of output tells you that this is the current location of the debuggee (the *), that we're at line 12 in the source file, the correct PC value for this line, and the assembly instruction.

How can I undo a Ladebug command?

It happens all the time: you were closing in on the bug, but then you did one next or cont too many and you're past the point where the bug happens. If only you could undo it!

The short how-to is like this:

    (ladebug) save snapshot
    # 1 saved ....
    (ladebug) cont

... Oops! It went past the bug!

    (ladebug) clone snapshot

... Now back before the "cont".

For a fuller explanation, see the manual, which explains how to use snapshots as an undo mechanism here. Please note that the manual may take some time to download!

How do I send a signal to my application?

Ladebug clears all signals when it starts, steps or otherwise continues the application, because it expects that the signal which stopped the application is one caused by Ladebug (e.g. the SIGINT raised by a breakpoint instruction).

If your application handles signals, and the signal is not one put in there by Ladebug, but one you want your application to handle, then this can be a problem, because Ladebug's normal signal-clearing action will have caused your signal to disappear.

The sequence of events is:

Signal is raised
Ladebug catches it, gives you the command prompt
You type cont
Ladebug clears the signal and continues the application
So the signal is lost!

To send the lost signal (or any other signal) to your application, use the cont <signal-value> form of the cont command:

    (ladebug) cont SIGSEGV

    (ladebug) cont 11

As previously mentioned, by default Ladebug will stop when encountering a signal that it thinks it should handle (even if you've defined a handler for it in your program). You can prevent Ladebug from handling the signal (thus allowing your defined handler to handle it), by using the ignore <signal-value> form of the ignore command. For example, if you have a SIGSEGV handler defined, you would:

    (ladebug) ignore SEGV

    (ladebug) ignore 11

To see the list of signals that Ladebug expects to handle, use the catch command (with no arguments). To see the list of signals that Ladebug expects the program to handle, use the ignore command (with no arguments). A target program must be loaded to use these commands. For example:

    $ ladebug your-executable
    (ladebug) catch
    INT, QUIT, ILL, TRAP, ABRT, EMT, FPE, BUS, SEGV, SYS, PIPE, TERM, URG, STOP,
    TTIN, TTOU, XCPU, XFSZ, PROF, USR1, USR2, VTALRM, RTMIN, RTMIN1, RTMIN2,
    RTMIN3, RTMIN4, RTMIN5, RTMIN6, RTMIN7, RTMAX, RTMAX7, RTMAX6, RTMAX5, 
    RTMAX4, RTMAX3, RTMAX2, RTMAX1
    (ladebug) ignore
    HUP, KILL, ALRM, TSTP, CONT, CHLD, WINCH, IO

We recommend against ignoring SIGTRAP. It will lead to problems because Ladebug would not be able to see any breakpoints, including the temporary ones put in for step, next, stepi and nexti.
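To experiment with cont <signal-value> and ignore, it helps to have a small debuggee with its own handler. The following is a minimal sketch in C and is not part of any Ladebug kit; the names install_usr1_handler and usr1_count, and the choice of SIGUSR1, are assumptions made purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical example names; nothing here comes from the Ladebug kit. */
static volatile sig_atomic_t usr1_seen = 0;

/* Handler: only records that the signal arrived (async-signal-safe). */
static void on_usr1(int signo)
{
    (void)signo;
    usr1_seen++;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_usr1_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Number of SIGUSR1 deliveries the handler has seen so far. */
int usr1_count(void)
{
    return (int)usr1_seen;
}
```

A program would call install_usr1_handler() early in main and raise(SIGUSR1) later. Going by the sequence described above, a plain cont after Ladebug stops on the signal should leave usr1_count() unchanged, while continuing with the signal value or ignoring the signal beforehand should let the handler run.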

Ladebug crashes with `socket error` when I type `n` to a `More` prompt during a parallel debugging session!

This is a known bug in V66. To get around the socket error, enter the following command, either in a script or at the Ladebug prompt. It will turn off paging.

    set $page = 0

In V67, there is still a bug, but Ladebug now includes an internal workaround, so you don't need to set $page. Ladebug now treats an n response to the More prompt as if it were not an n. The listing thus continues.

We are sorry we did not have time to fix this bug.

How do I debug in a library opened with `dlopen`?

Debugging inside shared libraries opened with dlopen() is possible, but it can be tricky. Ordinary shared libraries linked into the application work just fine because Ladebug knows about them when the program loads, and can read the debug information for the library and set breakpoints using that debug information.

But Ladebug doesn't know which library is going to be opened by a dlopen call, and so can't see the library or read its debug information. It will automatically load the debug information when the library is actually opened, so the breakpoint can be set at that time.

The trick is to stop at the right point in the library opening sequence, and then print out the name of the file being opened. At that point, Ladebug will have loaded the file's debug information and you can set breakpoints in the library.

This example prints the name of the library so that the user could just continue if this were the wrong one rather than returning.

    (ladebug) stop in foobar # Won't work before the library is loaded
    Symbol "foobar" is not defined.
    foobar has no valid breakpoint address
    Warning: Breakpoint not set
    (ladebug) stop in __dlopen { print (char*)$a0 }
    (ladebug) r
    [1] stopped at [<opaque> __dlopen(...) 0x3ff800dc4cc]   
    0x140000010="./library.so"
    (ladebug) return         # Finishes the load. Ladebug has read the debug information.
    (ladebug) stop in foobar # Can now set your breakpoint

You only need to set the breakpoint once per session; on a re-run, Ladebug will correctly re-install the breakpoint, and you don't have to stop at __dlopen again.

This command does it all in one step, if you know the library of interest is the first one to be opened:

    when in __dlopen { disable $curevent; return; stop in foobar }
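For reference, the application side of this scenario might look like the sketch below. It is a minimal C illustration, not code from any real application: the helper name call_in_library is hypothetical, and the library path and symbol are whatever the caller passes in (./library.so and foobar in the transcript above).

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Open the shared library at `path`, look up `symbol`, and call it as
 * int (*)(void). Returns the callee's result, or -1 if the library or
 * the symbol cannot be found (so a callee returning -1 is ambiguous).
 */
int call_in_library(const char *path, const char *symbol)
{
    /* Per the text above, the library's debug information only becomes
       available to the debugger once this open has completed. */
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    int (*fn)(void);
    *(void **)(&fn) = dlsym(handle, symbol);  /* POSIX-recommended cast idiom */
    if (fn == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    int result = fn();  /* a breakpoint set in the library routine stops in here */
    dlclose(handle);
    return result;
}
```

On some systems the program must be linked with -ldl for dlopen and dlsym to resolve.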

How do I use a Ladebug $variable in a shell command?

If you want to use a Ladebug debugger variable in a shell command, you can't just put it in the command, as Ladebug doesn't look at the body of a shell command.

There is a trick you can use, based on the record feature. The example shows how to do this, moving the value of $hexints into a file whose contents can then be used in a shell command (here, echo):

    (ladebug) record output foobar
    (ladebug) p "ME", $hexints
    ME 0
    (ladebug) unrecord output
    (ladebug) sh grep ME foobar | awk '{print $3}' > my_pid
    (ladebug) sh echo `cat my_pid`
    0

The grep is there to get rid of extraneous lines (in this case the prompt for the line with unrecord), and the awk selects the desired output from a line containing the prompt, "ME" and the zero.

NOTE:

If you look at the generated file foobar, you will see that it ends with "(ladebug) " without a trailing new-line. So if you do sh cat foobar there will be an apparent double prompt: the last line of foobar and the actual new prompt.

How do I use a shell variable in Ladebug?

As in exporting Ladebug variables, you have to make a trip through a file. Here's one method, where you construct a file of Ladebug commands and then use the source command to execute it.

The example shows importing the shell environment variable SHELL into Ladebug as the Ladebug variable $myshell.

    (ladebug) sh echo "set \$myshell = \"$SHELL\"" > my.cmd
    (ladebug) sh cat my.cmd
    set $myshell = "/bin/csh"
    (ladebug) source my.cmd
    (ladebug) p $myshell
    "/bin/csh"

The file is listed with the cat command for illustration only; listing the file is not a required step.

GUI problems

The new GUI is still a work-in-progress, and not all of Ladebug's features are available in a "GUI-style" way yet. The GUI's command-input window may be used to access those features, using the traditional Ladebug command language.

Common GUI questions are addressed in the GUI FAQ.

If you find problems with the GUI, please report them to us! We also welcome suggestions and comments.

I can't debug my core file!

If you debug a core file on a system different from the one on which the core file was built, then you will have problems with shared library versions, because the core file itself does not contain the shared libraries used by your application.

The symptoms can range from subtly-wrong details to outright failure of Ladebug to let you debug. Unfortunately, Ladebug doesn't detect that you are debugging on a different version of the operating system than the core file was built on.

The good news is there is a way to do this correctly. The details are in the manual. The manual is large, so it may take some time to load.

What's this `libpthreaddebug.so` message mean?

You may see an error message like this:

    This libpthreaddebug.so version 318037 cannot connect to a process running
    libpthread 318042.

This can happen when you have a transported core file, and are debugging on a different machine or a different OS level than the core was created on. See the section above for how to deal with this.

Heisenbugs: how your application can be affected by Ladebug

We use the term "Heisenbug" for a bug caused by the use of a debugger; it is based on a humorous analogy with the well-known "Uncertainty Principle" described by Werner Heisenberg, which is often defined as saying "you can't look at something without impacting it". (We know this isn't what the "Uncertainty Principle" really says. Some of us have taken real physics courses. It's just a joke.)

There are several ways your application might notice that it is being debugged, and each of them has the potential to create a bug, depending on how your application works.

If your application reads its own instruction stream, it might see the breakpoint opcode (0x80) instead of the original instruction. Note that while you are debugging, Ladebug hides the fact that breakpoint instructions have been inserted from the user: if you use a memory display command, it will show you the instruction which was replaced by the breakpoint instruction. When the application needs to execute the original instruction, Ladebug will temporarily restore it.

If your application checks the protection of its own pages, it might see that some pages have been set non-writable by Ladebug. Ladebug implements watchpoints by changing the protection of the page which contains the watched variable or memory. Ladebug will un-protect the watched pages and single-step instructions which write to the page as part of its normal watchpoint processing, so the change to page protection is not normally visible to the application.

Ladebug uses SIGINT (control-C) to interrupt the application. If your application catches SIGINT, Ladebug may be unable to work correctly, and if your application generates SIGINT and expects to catch it in some particular code, Ladebug may swallow the signal invisibly or get confused.

If your application checks its own pid value, and you use the snapshot feature, your application will be able to detect that Ladebug has changed the identity of the process running your application.

If your application depends on exact timing, it may notice that it is slowed down when controlled by Ladebug.

Luckily, most applications do not do any of the things mentioned above. Unluckily, if they do, the confusion or bugs caused by interaction with Ladebug can be very hard to track down.
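As an illustration of the pid case, an application that caches its process id at start-up and compares it later would notice the change in process identity. This is a minimal sketch in C; remember_pid and pid_changed are hypothetical names, not taken from any real application.

```c
#include <sys/types.h>
#include <unistd.h>

/* Process id recorded at start-up; 0 means "not recorded yet". */
static pid_t startup_pid = 0;

/* Call once, early in main. */
void remember_pid(void)
{
    startup_pid = getpid();
}

/* Returns 1 if the current pid differs from the recorded one, else 0. */
int pid_changed(void)
{
    return startup_pid != 0 && getpid() != startup_pid;
}
```

Code like this behaves identically in a normal run, which is part of what makes the resulting bug hard to track down: it only shows up when the process identity changes underneath the application.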

System patches that Ladebug users should install

The following Tru64 UNIX patches clear up various Ladebug debugger problems that have been reported to us.

Any pthreads patches.

Customers can get these from http://www.support.compaq.com/patches/

System crashes

The debugger has no special privileges or hooks into the system, so it cannot cause any crash that another debugger or other program could not also cause. Such crashes are therefore fundamental system problems, not Ladebug problems.

However, the debugger does exercise parts of the system (especially procfs, ptrace, and pthreads) more than most other programs, so crashes caused by these components are often perceived as 'the debugger crashed my system'. The Ladebug team sometimes gets such reports.

There are currently no such crashes known.

Reducing the size of debuggable executables

One reason .o and .so files can be large is the duplication of information in them, caused by the many interrelationships amongst the classes in the application.

Using the C++ -g rather than -gall switch can radically reduce the size of your executable files.

Further reduction can be attempted using an undocumented feature of the V5.6 and later C++ compilers. The compiler can be directed to put out each class once, and only once, into a set of .o files. The debugger will then find this single occurrence when debugging the application containing all these .o files.

This is done by defining the environment variable CXX_DEBUG_INFORMATION_CACHE to be the full file-name of an empty file before invoking your make script, as in the example below:

    setenv CXX_DEBUG_INFORMATION_CACHE $TMPDIR/136132834732823953.txt 
    rm -f $CXX_DEBUG_INFORMATION_CACHE
    touch $CXX_DEBUG_INFORMATION_CACHE
    make -f your_make_script 
    rm -f $CXX_DEBUG_INFORMATION_CACHE 
    unsetenv CXX_DEBUG_INFORMATION_CACHE

CAUTION: C++ V6.0 and V6.1 tend to eliminate too many classes with this option. As soon as they have seen the incomplete class, they add it to the cache, and forget to put out the complete class EVER. The C++ V6.2 compiler has the fix for this problem.

This capability may reduce the size of the .o and .so files in your application to the point where the Ladebug debugger can handle it easily. Feedback on this feature would be useful to us. If you find that your application size either decreases greatly or fails to decrease, please tell us so we can help the compiler group refine their approach.

Apart from these approaches, the only other approach we can suggest is compiling most of your sources -g0, and compiling just a small subset -gall.

What happened to all the routines on the stack?

If you do a where and the stack seems to be missing routines, you may be seeing the result of a compiler optimization called "tail calls". That optimization works like this:

If a procedure MIDDLE calls a procedure INNER just before returning and certain conditions are met, then MIDDLE might simply jump to INNER instead of doing a call. No code has to be generated to do the stack manipulations which would save MIDDLE's context and restore it after the call to INNER. This is safe because the call is the last thing MIDDLE does, so the saved context would never be used afterwards.

After the jump, INNER will execute and then return directly to MIDDLE's caller, OUTER, and there will be no record of MIDDLE's existence on the stack.

So if you stop the application in INNER and do where, you will see INNER and OUTER, but not MIDDLE.

Since this transformation can occur more than once, it is possible for several intermediate calls to appear missing from the context stack.

The conditions which permit this optimization include, among others:

If MIDDLE returns a value, then INNER produces the value to be returned;
No stack-allocated variables in MIDDLE are used during the execution of INNER;
MIDDLE does not establish any exception handlers.

There's no problem or bug when a routine is missing from the stack due to this optimization, but it can be confusing.
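The shape of code that invites this optimization can be sketched in C as follows. The routines outer, middle and inner are hypothetical stand-ins for OUTER, MIDDLE and INNER; whether a given compiler actually turns the call into a jump depends on the compiler and its optimization level.

```c
/* Hypothetical routines mirroring OUTER, MIDDLE and INNER in the text. */

int inner(int x)
{
    return x * 2;            /* stop here and run `where` */
}

int middle(int x)
{
    /* The call to inner is the last thing middle does, its result is
       returned unchanged, and no local of middle is needed afterwards,
       so an optimizing compiler may emit a jump instead of a call. */
    return inner(x + 1);
}

int outer(int x)
{
    int r = middle(x);       /* not a tail call: r is used afterwards */
    return r + 10;
}
```

Compiled without optimization, a stop in inner would normally show all three routines; with the optimization applied, middle may be absent from the listing even though the program's result is the same.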

My stack makes no sense!

If what you see on a where is a stack which just doesn't make sense (i.e. random numbers without any routine names), then it's likely that your application has gotten lost.

    (ladebug) where
    >0  0x11ffff7d4

Typically this kind of stack means that your application has lost track of the real stack and real code location, and is now executing random bits of memory, interpreting it as instructions.

If you're coding in C++, one of the most common ways to get a nonsense stack is for your code to try to execute a method on an invalid object. If the object has already been deleted, has not yet been initialized, is not there, or is of a completely different class, then the virtual function table won't be correct, and the application will be treating random memory as the virtual function table and calling a random place.

The ring tool is a good way to track this kind of problem down, as it will let you find out where your application was when it made the call into nowhere.

How did I get here from the `exec`?

Normally, Ladebug shows you the execution point of your application, and if you next or step from that point, you'll probably get to the place you expect, unless something else happens.

There are two cases where this isn't true. One is on start-up, and the other is after an exec has been caught (by use of "set $catchexecs = 1").

In both cases, Ladebug presents the application to you as though it were about to execute the first line in main. In the start-up case, this is because there is as yet no execution point and it might as well present you with something you're probably interested in; in the exec case, this is because the application is executing system code to do the work of the exec, and is not yet executing the code you wrote, so we treat it like the start-up case.

When your application is in this state, you may set breakpoints on initialization routines and constructors for static or global items and know you'll hit them. This is why Ladebug doesn't automatically run your application forward to the start of main. But Ladebug has next to no information about the loader, and showing it to you doesn't make much sense, as that's not a context you can do any meaningful work in. So Ladebug shows your context as main.

Unlike the start-up case, in the exec case, it is possible to use motion commands since the application is running. However, the motion specified by next or step may lead to confusing results, as it will be a next or step from the position in system code. What that code does, and where a motion winds up stopping may vary from one OS release to another.

Thus for some releases, next from a caught exec will stop at __start while for others it will stop only when the application terminates. It is safer to place a breakpoint at a known place (like __start or main) and continue than to use next or step after catching an exec.

The behavior of a next or step command issued just before the exec call is also hard to define; again, it's better to set a breakpoint after the exec is caught and continue than to try to follow the system code as it loads a new image and transfers control, a process which can vary in details from release to release.

Why can't I call my function?

If you get the following error message when you attempt to call your function x, then the function may not be visible or may have another name:

Symbol "x" is not defined

Try running the application to a point where the function is visible to the current context by language rules. For example, static functions in other files are not visible outside of those files.

Or maybe your symbol is different by the time the C pre-processor is done. Consider the following example from the SPEC CPU2000 benchmark 253.perlbmk (a reduced version of the popular perl program):

    (ladebug) file sv.c
    (ladebug) list 3099,3101
       3099 I32
       3100 sv_cmp(register SV *str1, register SV *str2)
       3101 {
    (ladebug) call sv_cmp (0x14019a2a0, 0x14019a360)
    Symbol "sv_cmp" is not defined.
    (ladebug)

Although the original source code plainly says that the name of the routine is sv_cmp, this is not the name that the compiler sees. Find out what the compiler sees by just running the C pre-processor:

    % diff sv.c sv.c-commented
    3100c3100
    < sv_cmp(register SV *str1, register SV *str2)
    ---
    > sv_cmp(register SV *str1, register SV *str2)  /* plugh */
    % cc -E -C -DSPEC_CPU2000_DUNIX sv.c-commented > tmp.tmp
    % grep plugh tmp.tmp
    Perl_sv_cmp(register SV *str1, register SV *str2)  /* plugh */
    %

The above example uses two tricks to make the pre-processor output easier to interpret:

    -C          retains comments in the pre-processor output
    /* plugh */ is an arbitrary comment inserted to make it easy
                to find our place in the pre-processor output

By putting in a plugh and then grep-ing for it, we discover that by the time all the magic is done with perl's various .h files, our routine name has become Perl_sv_cmp. And, indeed, Ladebug can call it using that name:

    (ladebug)  print Perl_sv_cmp(0x14019a2a0,0x14019a360)
    0
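The renaming mechanism itself is easy to reproduce. The sketch below is a minimal C illustration with made-up names (compare_items, App_compare_items, compiled_name); it is not taken from perl's headers, but it shows how a single #define changes the symbol that ends up in the object file.

```c
/* Hypothetical header content: a project-wide prefix applied by macro,
   in the same spirit as the renaming of sv_cmp described above. */
#define compare_items App_compare_items

/* Two-level stringification so the argument is macro-expanded first. */
#define STRINGIFY(x) #x
#define EXPANDED_NAME(x) STRINGIFY(x)

/* The source says compare_items, but the compiler sees App_compare_items. */
int compare_items(int a, int b)
{
    return (a > b) - (a < b);
}

/* Returns the name the compiler actually sees: "App_compare_items". */
const char *compiled_name(void)
{
    return EXPANDED_NAME(compare_items);
}
```

In a program built from this source, the debugger would only know the routine by its expanded name, since the macro never reaches the symbol table.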

My program hangs when I debug the part with a `fork`!
How do I debug a forked child? It forks but I don't get control!

If you catch the child but not the parent, and the parent code tries to execute a wait on the child, then the target will get stuck forever, with no progress being made.

Without the wait, if the parent process doesn't stop (no breakpoints or signals) and continues to compute, it can look like a hang, as you will have to wait for the parent process to complete before you can debug the child.

Ladebug can only focus on one process at a time, and starts out focused on the parent. What's happened is that after the fork is caught, Ladebug is still focused on the parent, which is still running, so Ladebug isn't talking to you; the parent is at the wait call, so it's not talking to you; the child is stopped on the fork, but doesn't have Ladebug focus, so it's not talking to you.

We therefore recommend always stopping the parent as well as the child when debugging programs that fork.

If you think Ladebug is hung, try typing Ctrl/C, which should get the Ladebug prompt back after stopping the parent. The where command will show you whether the parent is in the wait call, and the show process all command will show you which process has Ladebug focus.

For an example, see the section of the manual on debugging programs that fork.
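The pattern that produces the apparent hang is the ordinary fork-then-wait idiom. The following minimal C sketch shows it; run_child_and_wait is a hypothetical helper written for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with `code`, then wait for it in the parent.
 * Returns the child's exit status, or -1 on error.
 */
int run_child_and_wait(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0)
        _exit(code);               /* child: never reached if the child is held stopped */

    int status = 0;
    /* Parent blocks here until the child exits. If the child is stopped
       and never continued, this call never returns. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Outside the debugger the function returns promptly; the hang described above arises only when the child is caught and left stopped while the parent sits in waitpid.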

After I attach, the program doesn't stop!

When Ladebug attaches to a running program, it doesn't stop it automatically, so the program keeps running.

To stop it, you can type a control-C or click the Interrupt button on the GUI.

To stop all attached programs automatically, create a .ladebugrc file in your home directory containing this line:

    set $stoponattach = 1

This will tell Ladebug to stop all programs after attaching.

Why can't I `set` my variables? Why both `set` and `assign`?

Ladebug uses set to change debugger variables and assign to change program variables, because they have different lifespans and different semantics.

If you try to set a program variable, what you'll really do is create a new debugger variable with the same name as the program variable. When there is a conflict because a program variable and a debugger variable have the same name, it's undefined which one Ladebug will find when evaluating the name. The example shows how a set creates a new debugger variable rather than changing the value of the program variable. Note that in this case the debugger variable i is hidden by the program variable i, but becomes visible when the program variable disappears!

    (ladebug) p i
    0
    (ladebug) set i = 9
    (ladebug) p i
    0
    (ladebug) unload
    Process has exited
    (ladebug) p i
    9

For this reason, we encourage users to start all debugger variable names with a dollar-sign ("$"). None of the languages supported by Ladebug allows an initial dollar-sign, so there can be no name conflicts if you follow that convention.

Why can't I watch my variable?

Sometimes when you try to watch a variable, Ladebug won't do it:

    (ladebug) watch variable i
    Unable to take address for i
    Warning: Watchpoint not set.

This happens when the variable is in a register, which has no address. Ladebug implements watch with page protection, which works on variables allocated in memory.

There are two ways around this problem:

You can change your code to make the variable in question be static, which will force it to memory. This may require other code changes to maintain the correct semantics.
You can use the stopi command. This will check the contents of the register after each instruction. Since this only makes sense when that register contains the variable in question, we recommend that you use a helping pair of breakpoints to enable and disable the stopi breakpoint in such a way that it is only enabled during the lifetime of the variable in question:

    (ladebug) stopi i
    (ladebug) when at 100 { enable 1 }
    (ladebug) when at 200 { disable 1 }

Please note that the second solution may give odd results if the lifetime of the variable extends beyond the area where the stopi breakpoint is enabled: any changes in that part of its lifetime will be announced when the breakpoint is re-enabled.

Further, if the stopi breakpoint is enabled outside the lifetime of the variable, Ladebug will complain because the reference is now undefined.

In general, we recommend watch over stopi where possible, for these and other reasons.
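The first workaround, forcing the variable into memory, can be sketched in C as below. Both functions are hypothetical examples written for this illustration; whether the plain local really ends up in a register depends on the compiler and optimization level.

```c
/* Hypothetical loop used to illustrate the first workaround. */

/* A plain local like `i` below may live only in a register when the
   code is optimized, in which case it has no address to watch. */
int sum_to_register(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Declaring the counter static forces it into memory. It is assigned
   on every entry, so results are unchanged, but the function is no
   longer reentrant or thread-safe: an example of the semantic side
   effects that the text warns about. */
int sum_to_memory(int n)
{
    static int i;
    int total = 0;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Both versions compute the same sum; only the storage of the loop counter differs.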

Why does `print` get a parse error?
Why can't I print `state` or `thread`?

If you are running into this problem, you are most likely using Ladebug -64 or older. What follows explains when this problem would occur in older Ladebug debuggers and how to get around it.

As the manual mentions, variables with names which are the same as Ladebug keywords will produce parse errors due to the way Ladebug parses commands.

The names in question are the following:

at
if
in
state
thread
with

To print these names, make them expressions by enclosing them in parentheses:

    print (state)

Note, however, that in -65, you still need to parenthesize identifiers that coincide with thread, at, if, and in if and only if they occur in the expressions of the following commands:

where expression
stopi expression
trace expression
tracei expression
wheni expression

The parameters shown by `where` are wrong!

If you do a where command and look up the stack (down the listing) at the reported parameters, you may be surprised to find that they are shown with different values than you expect them to have.

Parameters are passed in places defined by the Alpha calling standard (the first six in registers R16, R17, etc.). Compilers generating code for a routine may decide to leave a parameter in a register, or move it to another register or to a location in the stack. If the parameter isn't used after a certain location in the code, the compiler may decide to re-use the register or stack location for another value in the code following that point.

The compiler describes this to the debugger through debug information in the executable. These records tell Ladebug where a variable may be found when the routine is executing within a specified PC range.

When Ladebug does the where command, it looks up the stack, and does a virtual unwind, re-creating for itself the context of the routines on the stack. Then it uses the debug information from the compiler and the routine's recreated PC value to figure out where the parameter variables are located so it can extract and print their values.

There are two ways this can fail without being a bug:

If the routine in question has no debug information, then Ladebug will assume that parameters are in the locations specified by the calling standard. This is only certain to be true at the very beginning of a routine, and thus may often give wrong results. In a future release, Ladebug will no longer "guess" this way.

If the routine is optimized, then the debug information may not completely describe the current location of the parameter, and Ladebug may be fetching a value from the wrong place.

Our advice for debugging is to always debug an un-optimized executable (-g) if you can; if you must debug an optimized executable (compiled -g3 -O<something>), be aware that occasionally Ladebug will be confused by an optimization.

Reporting Problems

Get the latest compilers and Ladebug kits and see if the problem is fixed.

If you find that the problem is not fixed, send mail to Ladebug.Support@hp.com containing the following:

Ladebug version (from the Ladebug welcome message)
Operating system version (from uname -rsv)
Any traceback or messages from Ladebug
If possible, the sources of a small reproducer

The smaller the reproducer, the more likely you are to have the problem fixed in the near future.

If you have an idea for another FAQ item, please send mail to the Ladebug team. Thanks!
