What is Ladebug?
What versions of Ladebug are there?
How do I get a kit?
Is there a hard-copy manual?
What other development tools do you recommend?
What can I do to prepare my application for debugging?
All I want to do is add a printf!
How can I get my program to start Ladebug?
How can I see a history of my program's execution?
What was my program doing just before it got lost?
How do I use Emacs with the debugger?
How do I stop in the main Fortran program?
How do I examine Fortran arrays?
How do I examine items in Fortran Modules?
How do I set a temporary or counted breakpoint?
How do I set a conditional breakpoint using a variable in the breakpoint's define-time scope?
How can I get to 'just before' where my application causes a SEGV?
How can I debug a stripped binary?
How do I relate a raw PC to my application code?
How can I undo a Ladebug command?
How do I send a signal to my application?
How do I debug in a library opened with dlopen?
How do I use a Ladebug $variable in a shell command?
How do I use a shell variable in Ladebug?
I hate having an empty line mean "repeat the previous command"
I hate the garbage characters for line-editing!
I hate the command language!
I hate the way Ladebug is so verbose!
I hate the way the links in the manual don't work all the time!
I hate the way the edit command stops the debugger!
Ladebug hangs when I debug my application!
Ladebug hangs when I do a run!
My program hangs when I debug the part with a fork!
How do I debug a forked child? It forks but I don't get control!
After I attach, the program doesn't stop!
Why can't I set my variables?
Why both set and assign?
Why can't I watch my variable?
Why can't I see the commands I am typing in?
What does opaque mean?
The debugger outputs Assertion failed and a traceback. What do I do?
GUI problems
I can't debug my core file!
What's this libpthreaddebug.so message mean?
Reducing the size of debuggable executables
What happened to all the routines on the stack?
My stack makes no sense!
How did I get here from the exec?
Why can't I call my function?
Why does print get a parse error?
Why can't I print state or thread?
The parameters shown by where are wrong!
My signal disappeared!
Ladebug crashes with socket error when I type n to a More prompt during a parallel debugging session!
Heisenbugs: how your application can be affected by Ladebug
System patches that Ladebug users should install
System crashes
Reporting problems
What is Ladebug?
Ladebug is a debugger, very much like dbx or gdb; it has a command-line mode and a GUI. The command language is similar to dbx's.
Ladebug runs on HP Tru64 UNIX and Linux for HP Alpha systems. It is available for free: kits can be downloaded over the Net.
The Linux version of Ladebug is still in development, and not all of Ladebug's capabilities are available there; see the README file in the Linux kits for the current list of unsupported capabilities.
What versions of Ladebug are there?
Ladebug has separate versions for each major release of the operating system for best performance, but is generally upward-compatible. It is not generally downward-compatible.
On Tru64 UNIX, there are versions for v4.0 and v5.0. The v4.0 version does not support kernel debugging on v5.0, but otherwise will run fine.
On Linux, Ladebug versions are generally upward-compatible.
The latest version is available on the internet. There are v4.0 and v5.0 versions of this kit, both containing the GUI. The v4.0 version of the kit will install and run on v5.0 systems, but the v5.0 kit provides better kernel debugging support for v5.0 (and higher) systems.
Ladebug now has a built-in GUI, replacing the previous dxladebug GUI.
The Ladebug provided with the base OS (that is, the one that comes with the operating system) doesn't include the GUI in some OS versions due to space limitations on the CD-ROM. To get a Ladebug with the GUI in that case, download a Ladebug kit of the appropriate type.
How do I get a kit?
Customers can get a kit over the Internet from either:
We (the Ladebug team) prefer you to use the latter - a lot of testing goes in before we declare it to be "field-test-quality"!
See the Fortran for Alpha Linux or the Ladebug debugger web site.
Is there a hard-copy manual?
There is no hard-copy manual. Ladebug's manual exists only as HTML source, to support searching and rapid updating.
If you need hard-copy, most browsers will let you print the current page; for Ladebug, the current page is the whole manual. The manual's formatting has been made flexible as part of being a web document, so it doesn't depend on page breaks or page widths.
Having the whole manual be one page has drawbacks--it can take a long time to load. Breaking it up has drawbacks, too--you can't search it as easily, and it complicates internal links. Please tell us whether you'd prefer us to break the manual into a set of HTML chapters, or like it this way!
What other development tools do you recommend?
Visual Threads - a unique and fully visual tool to analyze and debug multithreaded programs.
Developers' Toolkit for Tru64 UNIX - a prerequisite for programming under Tru64 UNIX, the kit includes the Compaq C compiler, and application debugging, profiling, analysis, reordering, and porting tools.
ATOM - a toolkit from the Developers' Toolkit above, but worth listing on its own! It lets you modify an application so that it will perform run-time analysis on itself. We use ATOM to build the history tool, for example.
What can I do to prepare my application for debugging?
There are things you can do to your program to make it easier to debug that go beyond using -g on the compile line:
If you use free and malloc, you can wrap the calls to those routines inside your own memory getting and freeing routines. This will let you add checking or memory-optimization code later, if you want. But for debugging it will also let you turn off free-ing completely. If you have a bug, and it goes away when you comment out the free in your wrapper routine, you've just found out that your bug is due to a memory-usage error like using a pointer to deleted storage or reading uninitialized data. Wrapper routines will also give you a place to set a breakpoint, from which you can set a watch on a particular chunk of memory. If you know that the bad access is a write to a particular memory address, for example, you can set a temporary breakpoint in the malloc wrapper to stop when the block containing that address is allocated, and then set a watchpoint on that memory.
Add assertions (for example, with the assert package), which can verify assumptions in your code. For example, you can add in checks that pointers are not null, that lists are not circular, and that things which should be sorted are sorted.
Using #ifdef would allow this code to be included only in the development version of your application.
Using #ifdef will allow you to keep the logging overhead out of the production version.
Ladebug itself uses logging controlled by environment variables, self-dumping data-structures and thousands of ASSERTs. We have found these very helpful in finding and fixing bugs.
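The wrapper-routine idea can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from Ladebug or from any library: the names my_alloc and my_release, the MYAPP_NO_FREE environment variable, and the MYAPP_DEBUG macro are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical switch: when the environment variable MYAPP_NO_FREE is
 * set, my_release() keeps the storage instead of freeing it. */
static int free_disabled(void)
{
    static int cached = -1;            /* -1 means "not looked up yet" */

    if (cached < 0)
        cached = (getenv("MYAPP_NO_FREE") != NULL);
    return cached;
}

/* Allocation wrapper: a single place for a breakpoint or extra checks. */
void *my_alloc(size_t size)
{
    void *block = malloc(size);

#ifdef MYAPP_DEBUG
    /* Development-only logging, compiled out of the production version. */
    fprintf(stderr, "my_alloc(%lu) -> %p\n", (unsigned long)size, block);
#endif
    return block;
}

/* Release wrapper: freeing can be switched off while hunting a bug. */
void my_release(void *block)
{
    assert(block != NULL);             /* verify an assumption in the code */
    if (!free_disabled())
        free(block);
}
```

With wrappers like these, a breakpoint set in my_alloc gives you a place to start watching a newly allocated block, and running the program with MYAPP_NO_FREE set shows whether a bug disappears when nothing is freed.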
All I want to do is add a printf!
Ladebug can be used to add printfs to your code without the effort of recompilation and relinking.
Here's an example, showing how to add a run-time printf at line 10 of the file filename.C in the program myprog; user commands follow the $ and (ladebug) prompts:
$ ladebug myprog
Welcome to the Ladebug Debugger Version ...
object file name: /usr/users/aperson/myprog
Reading symbolic information ...done
(ladebug) file filename.C
(ladebug) when at 10 { printf "Value of x is 0x%x, of i is %d\n", x, i }
(ladebug) run
... program runs normally ...
Value of x is 0x1234, of i is 23
... and you get the printf output!
The major differences from the C printf are that the Ladebug command printf doesn't need parentheses or a trailing semicolon, and that you can add and modify the Ladebug printf without recompiling your application.
Since the debugger is involved, the Ladebug printf is slightly slower than recompiled code would be, but it is more flexible.
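For comparison, here is roughly what the compiled-in equivalent would look like in C. The function name report_values is hypothetical and simply stands in for whatever code sits at line 10 of filename.C; the format string is the one used in the Ladebug session above.

```c
#include <stdio.h>

/* Hypothetical stand-in for the code at line 10 of filename.C. It prints
 * the same text as the Ladebug "when at 10 { printf ... }" action, but any
 * change to it means recompiling and relinking the application. */
int report_values(unsigned int x, int i)
{
    /* Unlike the Ladebug command, C needs the parentheses and the
     * trailing semicolon. */
    return printf("Value of x is 0x%x, of i is %d\n", x, i);
}
```

Calling report_values(0x1234, 23) prints the same line as the debugger example.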
How can I get my program to start Ladebug?
debugBreak is a function that can be added to any program. When the debugBreak function is called, it creates a Ladebug debugger process initialized so that the debugger can connect to the program, using the attach function of Ladebug. The debugger winds up ready for input from the user, as though there had been a breakpoint at the end of the call to the debugBreak function.
debugBreak is described in more detail in its own note.
How can I see a history of my program's execution?
The history tool can be used with almost any program. It modifies your original program so that the program will record a history of the source lines it executes and the calls it makes.
The history tool is currently an unsupported prototype, and may change. It is available in source form; the buffer size and other factors can be changed by editing the source. It depends on ATOM, and will not work if ATOM is not available on your platform.
The history tool is described in more detail in its own note.
I hate having an empty line mean "repeat the previous command"
In interactive sessions, Ladebug repeats the previous command when you just press the return key again. For many users, this is a convenience feature, and they like the ability to step through a long sequence with one key-press per step.
But some don't, and there's a way to make interactive sessions work like Ladebug command scripts, and have blank commands be no-ops: set the $repeatmode variable to 0 at the command line or in a .dbxinit script:
(ladebug) p 12
12
(ladebug)
12
(ladebug) set $repeatmode = 0
(ladebug) p 13
13
(ladebug)
(ladebug)
I hate the garbage characters for line-editing!
If you capture Ladebug's output in a way that Ladebug doesn't notice (e.g. by invocation in an emacs buffer), you can see that Ladebug issues lots of escape characters to support line editing.
Unfortunately, that's about all you can see, because the extra junk pretty much obscures the actual input or output. It might look like this (the original command typed in is print 5 + 2):
^[[Jp^[[Jr^[[Ji^[[Jn^[[Jt^[[J ^[[J5^[[J ^[[J+^[[J ^[[J2
If you're willing to live without line-editing support, you can get rid of this using a control variable, which can be set either at the command line or in a .dbxinit file:
(ladebug) set $editline = 0
I hate the command language!
Some users prefer gdb's command language. We can't help there.
But with aliases you have a limited ability to tailor the command language to suit you better. For example, if you want ret to mean return and pp to mean print *, you can write a .dbxinit file which defines them that way:
alias ret "return"
alias pp "print *"
I hate the way Ladebug is so verbose!
Some people like lots of messages and some don't. There are several
controls to make Ladebug much less noisy. Set them as indicated if
you like a terse debugger:
set $doverbosehelp = 0
set $giveladebughints = 0
set $showlineonstartup = 0
set $showwelcomemsg = 0
set $stackargs = 0
set $statusargs = 0
See the manual for details.
I hate the way the links in the manual don't work all the time!
The manual is a large file, often taking several seconds to load. If you click on a link before the file is finished loading, the browser may not find the target of the link. To correct this, check the status bar at the bottom of the browser to make sure the file is done loading, then try the link again.
I hate the way the edit command stops the debugger!
Since Ladebug doesn't check the DISPLAY environment variable or the value of the EDITOR environment variable, it can't predict whether you'll be editing in the same window as the debugger or not. For example, if DISPLAY had no definition and EDITOR were emacs, the edit command would cause emacs to take over the current terminal window, while if DISPLAY were defined to a particular node, emacs would come up in a different window.
To prevent contention for the command-line window, Ladebug pauses while the editor is running, and you can only start issuing Ladebug commands again when you've exited the editor.
If you have set DISPLAY in such a way that the editing window will be a different window, and want to continue giving Ladebug commands while you continue editing, you can do this using the sh command and the & asynchronous command terminator. Here's an example showing how to do it for emacs:
(ladebug) sh emacs /usr/users/me/foo.c &
To find out the path to the current file, print $curfilepath. Unfortunately, you'll have to cut and paste the result into the sh command line, as shell commands are passed to the shell uninterpreted (that is, for "sh emacs $curfilepath &" Ladebug won't evaluate $curfilepath, but the shell will try to!).
Ladebug hangs when I debug my application!
If your application does something wrong, you might take the understandable shortcut of recalling the command and prepending ladebug to debug it.
$ foobar < foobar.input > foobar.output
*BANG*
$ ladebug foobar < foobar.input > foobar.output
But on this command line, ladebug is the main application, and it will read from foobar.input and write to foobar.output, thus ignoring the input from your terminal.
Use the run command to add any parameters and IO redirection:
$ ladebug foobar
:
(ladebug) run < foobar.input > foobar.output
This will redirect your application's IO, but not Ladebug's. To redirect Ladebug's IO, see the manual section for record. Note: The manual takes a few seconds to load.
Ladebug hangs when I do a run!
The rerun command repeats the arguments and any file redirections from your first run command, as does r, which is an alias for rerun. A run command will not repeat the arguments. This means that your application will probably attempt to read from stdin, which is probably the terminal you invoked Ladebug from.
If you are using the command-line interface, this may look like a failure to get a response from Ladebug, if your application does not hit a breakpoint before it reads from stdin. Because the run command does not produce any output, this can look like a hang. To get back to the Ladebug command line, enter Ctrl/C.
Why can't I see the commands I am typing in?
Because Ladebug offers support for editing command lines, it queries the terminal for the number of lines and columns. If the actual terminal size is different from the answers Ladebug gets, then Ladebug may not correctly display your input. The most common symptom is not being able to see the command as you type it in.
If this happens, check that you have correctly set the size information for your terminal at the shell-command level.
For example, if your terminal is 47x80:
% stty rows 47 ; setenv LINES 47
% stty cols 80 ; setenv COLS 80
Another way to ensure that your terminal is correctly sized is to use the resize command at the shell prompt. See the resize man page for details.
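If you want to see what size the terminal driver itself reports, a small C helper along the following lines can be used. This is a sketch of one common way for a program to ask (the TIOCGWINSZ ioctl); it is an assumption that this resembles what Ladebug does internally, and the function name query_terminal_size is hypothetical.

```c
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the terminal driver for the size of the terminal on descriptor fd.
 * Returns 0 and fills *rows and *cols on success, or -1 if fd is not a
 * terminal or the query fails. */
int query_terminal_size(int fd, int *rows, int *cols)
{
    struct winsize ws;

    if (!isatty(fd) || ioctl(fd, TIOCGWINSZ, &ws) == -1)
        return -1;
    *rows = ws.ws_row;
    *cols = ws.ws_col;
    return 0;
}
```

If the values reported for your terminal do not match the real window, the stty or resize commands described above are the way to correct them.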
What does opaque mean?
The debugger sometimes prints out:
Information: An <opaque> type was presented during execution of the previous command. For complete type information on this symbol, recompilation of this program will be necessary.
This happens when it can find the struct, union, or class identifier declaration in the debugging information, but can find no information about the identifier's definition.
This can happen for two reasons:
Some sources were compiled with "-g", and those sources contained the struct as an incomplete type:
struct S;
Other sources containing the struct definition
struct S { ... }
were compiled without the -g, or didn't even exist.
See cxx -gall in the cxx man page and release notes for more information.
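The incomplete-type situation can be pictured with a small C sketch. The names S, make_s and value_of are hypothetical; in a real program the declaration and the definition would sit in different source files, and only the files compiled with -g would contribute type information.

```c
#include <stdlib.h>

/* Incomplete declaration: code that sees only this line can pass
 * pointers to struct S around, but knows nothing about its members. */
struct S;

struct S *make_s(int value);
int value_of(const struct S *s);

/* The definition. In a real program this would live in a separate source
 * file, and that file would have to be compiled with -g for a debugger
 * to be able to describe the members. */
struct S {
    int value;
};

struct S *make_s(int value)
{
    struct S *s = malloc(sizeof *s);

    if (s != NULL)
        s->value = value;
    return s;
}

int value_of(const struct S *s)
{
    return s->value;
}
```

A caller that only ever sees the first declaration can still hold a struct S pointer, which is the case in which the debugging information records the name but not the definition.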
The debugger outputs Assertion failed and a traceback. What do I do?
The debugger does lots of internal checks, and also checks on the executable files and shared libraries that it is loading.
If these checks show an internal inconsistency, the assertion gets triggered. Please report the problem to us so we can fix it.
In many cases, Ladebug can recover and you will be able to continue debugging. In severe cases, when it cannot recover, Ladebug will exit.
We want Ladebug to be usable and robust: if you find an error, please tell us!
How do I use Emacs with the debugger?
You can control your debugger process entirely through the Emacs GUD (Grand Unified Debugger) buffer mode, which is a variant of shell mode. All the Ladebug commands are available and you can use the shell mode history commands to repeat them.
Ladebug version 4.0-48 and higher support GNU Emacs Version 19 and higher.
Ladebug version 4.0-58 and higher support Lucid XEmacs Version 19.14 and higher.
The information that follows assumes you are familiar with Emacs and are using the Emacs notation for naming keys and key sequences.
For each Emacs session, before you can invoke the debugger, you must load the Ladebug-specific Emacs LISP code, as follows:
M-x load-file
At the Load file: prompt, type:
/usr/lib/emacs/lisp/ladebug.el
You can also place a load-file call in your Emacs initialization file (~/.emacs). For example:
(load-file "/usr/lib/emacs/lisp/ladebug.el")
To invoke the debugger, type:
M-x ladebug
If you use symbolic links to access your sources, and/or if you use NFS-mounted file systems, you probably want to adjust the Emacs variable directory-abbrev-alist in your Emacs initialization file (~/.emacs).
For example, if your home directory is /usr/users/fred, its name may show up as /tmp_mnt/var/users/fred because of NFS mounting, and Emacs may shorten this to /var/users/fred. This can cause confusion if you load the file ~/src/prog.c into Emacs (which gets expanded as /usr/users/fred/src/prog.c). Later, the debugger will tell Emacs to load /var/users/fred/src/prog.c. You now have two different buffers (likely named "prog.c" and "prog.c<2>") trying to contain the same file but under different full names (/usr/... and /var/...). You will probably use the first, the debugging session the second. The solution is to put something like what follows in your Emacs initialization file:
(setq directory-abbrev-alist
      (cons (cons "^/var/users/" "/usr/users/")
            directory-abbrev-alist))
This will cause Emacs to translate filenames beginning with /var/users/ to be /usr/users/ instead, which will then find the file under the name you would expect. (The above assumes all usages of /var/users/ should be redirected to /usr/users/.) A little experimentation will reveal what translations you need for your environment. For more information on directory-abbrev-alist, do
M-x describe-variable [Return] directory-abbrev-alist
For more details, see the manual. Note: The manual takes a few seconds to load.
How do I stop in the main Fortran program?
When a Unix application is started, the system routine __start calls a routine named main. Since Fortran programs may not have such a routine, the Fortran RTL includes a file (for_main.c) which has a main which will call your main program unit.
This explains why you stop in for_main.c if you've followed the C habit of using "stop in main" as your first Ladebug command. If you do stop there, you can do a few step commands, and follow the call to Fortran code.
The simplest way to get to your 'main' program is for you to
just set a breakpoint on it by name:
(ladebug) list 1:2
1 program for_test
2 logical*1 b
(ladebug) stop in for_test
[#1: stop in subroutine for_test() ]
(ladebug) r
[1] stopped at [subroutine for_test():7 0x120001554]
7 10 b = 1
How do I examine Fortran arrays?
Ladebug incorrectly treats an array of strings as though the array had an extra left-most dimension indexing the individual characters, that is, as though it were an array-of-array-of-character.
For this reason, the debugger displays the following error message when
you try to use correct Fortran syntax:
(ladebug) print a(1)(:1)
print a(1)(:1)
^
Unable to parse input as legal command or Fortran expression.
Also for this reason, the debugger accepts expressions that Fortran would
not, for example:
(ladebug) print a(1:2,2:3)
(2) "ef"
(3) "ij"
You should use this second form to deal with arrays of strings.
How do I examine items in Fortran Modules?
If you are using Ladebug V66 or earlier, there is no support for Fortran Modules: you can't set your scope to a module, or print things by their name within a module.
But because the items do exist, they have "mangled" names in the linker table, and you can print them using the mangled name.
An item named an_item in a module named a_module will have the mangled name $a_module$an_item:
(ladebug) p $a_module$an_item
42
If you are using Ladebug V67 or V68 and your object file is compiled
by a Fortran compiler that emits module debug information, then you can
use the rescoping syntax to refer to a module component. For
example:
(ladebug) p `a_module`an_item
42
If you are using Ladebug V69 or later and Fortran version 5.5A or later, module components which are visible to the code are visible to Ladebug and can be referenced without the rescoping:
(ladebug) p an_item
42
Components in a module not made visible by a USE
statement will still require the rescoping syntax to specify the
module they are in, of course.
The modules themselves can be printed, as well:
(ladebug) print MIM_A
module MIM_A
use MIM_B, only:
BY = 3
BX = 2
AX = 1
end module MIM_A
How do I set a temporary or counted breakpoint?
Ladebug lets you construct temporary or counted breakpoints using debugger variables and if clauses:
Use $curevent and have the breakpoint's action list delete itself:
(ladebug) stop in f { delete $curevent }
Use a debugger variable to count breakpoint hits you want to skip,
and a helper breakpoint to increment the count:
(ladebug) set $bpt_count = 0;
(ladebug) when at main { set $bpt_count = 0 }
(ladebug) when at 12 { set $bpt_count = $bpt_count + 1 }
(ladebug) stop at 12 if $bpt_count > 10
If your current language is C or C++, you can use $bpt_count++.
A breakpoint that stops only the first ten times it is reached is much like the counted breakpoint, with a different condition:
(ladebug) set $bpt_count = 0;
(ladebug) stop at 12 if $bpt_count < 10 { set $bpt_count = $bpt_count + 1 }
One of the strengths of the Ladebug approach is that you can construct complex breakpoints using these mechanisms. Here's a breakpoint which triggers on the tenth and eleventh times and then disables itself:
(ladebug) set $bpt_count = 0
(ladebug) when at 59 { set $bpt_count = $bpt_count + 1 }
Note the number of that breakpoint (call it 1).
(ladebug) stop at 59 if $bpt_count >= 10
(ladebug) when at 59 if $bpt_count == 11 { disable 1, 2, 3}
The breakpoints are disabled rather than deleted so that the same setup can be used on a rerun.
How do I set a conditional breakpoint using a variable in the breakpoint's define-time scope?
Breakpoint conditionals in Ladebug are evaluated in the scope in which the breakpoint triggers, not in the scope in which you define the breakpoint.
If you have variables with the same names in different scopes, and you wish to set a breakpoint that triggers based on the value of a variable in the current scope, use a debugger variable.
By setting the debugger variable to the address (&) of the test variable, you can then set your conditional breakpoint based on the value of the dereferenced debugger variable.
Consider the following example:
(ladebug) list 1
1 #include <stdio.h>
2
3 int x = 0;
4 int sum = 0;
5
6 void bar(int x_from_main)
7 {
8 int x = 5; /* Inner 'x' */
9 sum += x_from_main%x;
10 printf("x in bar = %d\n",x);
11 }
12
13 void main()
14 {
15 for (x = 1; x < 10; x++) /* Outer 'x' */
16 {
17 printf("x in main = %d, ",x);
18 bar(x);
19 }
20 printf("final sum = %d\n",sum);
21 }
(ladebug) stop at 18
[#1: stop at "main.C":18 ]
(ladebug) run
[1] stopped at [void main(void):18 0x1200018c8]
18 bar(x);
(ladebug) print x
1
If you created a breakpoint conditional on x, that would test the inner x:
stop in bar if x > 5
To stop in bar only when the outer x becomes greater than 5, not the x in bar, create a debugger variable $mainX and use it to point to the x in main:
(ladebug) print $mainX
Symbol "$mainX" is not defined.
(ladebug) set $mainX = &x
(ladebug) print *$mainX
1
(ladebug) stop at 9 if *$mainX > 5
[#3: stop at "main.C":9 if *$mainX > 5 ]
(ladebug) disable 1
(ladebug) cont
x in main = 1, x in bar = 5
x in main = 2, x in bar = 5
x in main = 3, x in bar = 5
x in main = 4, x in bar = 5
x in main = 5, x in bar = 5
[3] stopped at [void bar(int):9 0x12000182c]
9 sum += x_from_main%x;
(ladebug) up
>1 0x1200018d0 in main() "main.C":18
18 bar(x);
(ladebug) print x
6
Note that the define-time variable must have a lifetime which includes the evaluation of the conditional. If the variable has been deleted or is a local variable in a routine no longer on the stack, then the value that is fetched through the pointer in the conditional is undefined, and will probably be wrong.
How can I get to 'just before' where my application causes a SEGV?
If your application gets a SIGSEGV or other signal in a routine which is called many times, it might help to know which call is the one to cause the problem.
There is an ATOM-based tool "counter" which will instrument your program and count calls. When a signal is raised, it can calculate the correct Ladebug commands to use to get you to "just before" the event which causes the signal.
You might also want to consider instrumenting the program with this tool for ordinary testing or daily use. The run-time overhead is relatively low compared to the ring tool (there's a slowdown of about 50%), and it can provide you with information which could otherwise take some time to figure out.
Users could combine this tool with debugBreak (or with the stack-dumping code described in the Ladebug manual). Note: The manual takes a few seconds to load.
Such a combination could cause the debugger to come up on signals, or to dump a stack when a signal occurred.
This tool is currently an unsupported prototype, and may change. It is available in source form, so that users may customize it by editing the source.
"Counter" is described in more detail in its own note.
How can I debug a stripped binary?
When you strip a binary with strip or ostrip, any debugging information is removed. Ladebug can't show you names, variables, or sources.
If your stripped code crashes and your customer sends you the traceback or core file, you'll only have hex addresses and machine instructions to work with.
The following are possible workarounds:
This requires having both versions of the application at the customer site, or having a reproducer of the problem at your site. If at the customer site, Ladebug still won't be able to show you your sources, unless the remote site also has your source files.
Now you can use the debuggable version to translate routine names to addresses and vice versa, using commands like p &main, and setting breakpoints at the hex addresses using stopi at.
A big advantage of this approach is that you can see sources on your
side. A disadvantage is that the remote side doesn't have any type
information, so you have to dead-reckon through data structures and
it's easy to make mistakes.
Use ostrip and produce an stb file, which contains the debugging information, and ship it with your application to the customer site. There you can use that file to re-create an unstripped executable and debug with it.
There's no real difference between this approach and shipping a debuggable version, but it does allow the customer to store the stb file off to the side and run the smaller stripped file most of the time.
One complication here is that there was a bug in ostrip in version V4.0 of Tru64 UNIX. But there is a workaround!
On V5.0 and higher, use this sequence:
$ ostrip -t your-executable # Produces .stb file
# Ship your-executable and your-executable.stb
# to the customer. When you wish to debug remotely,
# get the stb file, put it in the same directory
# as the executable and re-join with this command:
$ ostrip -j your-executable # Join them
# your-executable now has debugging information
On V4.0 systems, use this sequence:
$ ostrip -c your-executable # Adjusts for work-around
$ ostrip -t your-executable # Produces .stb file
# Ship your-executable and your-executable.stb
# to the customer. When you wish to debug remotely,
# get the stb file, put it in the same directory
# as the executable and re-join with this command:
$ ostrip -j your-executable # Join them
# your-executable now has debugging information
How do I relate a raw PC to my application code?
If you have obtained a PC (program counter) value and you want to find
out where that PC occurs in your source code, use the Memory Display
Commands to view that PC address as an assembly instruction. Note that
Ladebug maintains the current value of the PC in $curpc.
(ladebug) print $curpc
0x120000b98
(ladebug) 0x120000b98/i
int main(void): test.c
*[line 12, 0x120000b98] bis r31, 7, r1
The first command printed the contents of $curpc, to obtain a PC value. The second command is a memory display command, and the "i" means to format the contents of memory at the given address into an assembly instruction.
The first line of output tells you the function name int main(void) and source file test.c that the PC occurs in. The second line of output tells you that this is the current location of the debuggee (the *), that we're at line 12 in the source file, the correct PC value for this line, and the assembly instruction.
How can I undo a Ladebug command?
It happens all the time: you were closing in on the bug, but then you did one next or cont too many and you're past the point where the bug happens. If only you could undo it!
The short how-to is like this:
(ladebug) save snapshot
# 1 saved ....
(ladebug) cont
... Oops! It went past the bug!
(ladebug) clone snapshot
... Now back before the "cont".
For a fuller explanation, see the manual, which explains how to use snapshots as an undo mechanism. Please note that the manual may take some time to download!
How do I send a signal to my application?
Ladebug clears all signals when it starts, steps or otherwise continues the application, because it expects that the signal which stopped the application is one caused by Ladebug (e.g. the SIGINT raised by a breakpoint instruction).
If your application handles signals, and the signal is not one put in there by Ladebug, but one you want your application to handle, then this can be a problem, because Ladebug's normal signal-clearing action will have caused your signal to disappear.
The sequence of events is:
cont
To send the lost signal (or any other signal) to your application, use the cont <signal-value> form of the cont command:
(ladebug) cont SIGSEGV
or
(ladebug) cont 11
As previously mentioned, by default Ladebug will stop when encountering a signal that it thinks it should handle (even if you've defined a handler for it in your program). You can prevent Ladebug from handling the signal (thus allowing your defined handler to handle it), by using the ignore <signal-value> form of the ignore command. For example, if you have a SIGSEGV handler defined, you would enter:
(ladebug) ignore SEGV
or
(ladebug) ignore 11
To see the list of signals that Ladebug expects to handle, use the catch command (with no arguments). To see the list of signals that Ladebug expects the program to handle, use the ignore command (with no arguments). A target program must be loaded to use these commands. For example:
$ ladebug your-executable
(ladebug) catch
INT, QUIT, ILL, TRAP, ABRT, EMT, FPE, BUS, SEGV, SYS, PIPE, TERM, URG, STOP,
TTIN, TTOU, XCPU, XFSZ, PROF, USR1, USR2, VTALRM, RTMIN, RTMIN1, RTMIN2,
RTMIN3, RTMIN4, RTMIN5, RTMIN6, RTMIN7, RTMAX, RTMAX7, RTMAX6, RTMAX5,
RTMAX4, RTMAX3, RTMAX2, RTMAX1
(ladebug) ignore
HUP, KILL, ALRM, TSTP, CONT, CHLD, WINCH, IO
We recommend against ignoring SIGINT. It will lead to problems because Ladebug would not be able to see any breakpoints, including the temporary ones put in for step, next, stepi and nexti.
Ladebug crashes with socket error when I type n to a More prompt during a parallel debugging session!
This is a known bug in V66. To get around the socket error, enter the following command, either in a script or at the Ladebug prompt. It will turn off paging.
set $page = 0
In V67, there is still a bug, but Ladebug now includes an internal workaround, so you don't need to set $page. Ladebug now treats an n response to the More prompt as if it were not an n. The listing thus continues.
We are sorry we did not have time to fix this bug.
How do I debug in a library opened with dlopen?
Debugging inside shared libraries opened with dlopen() is possible, but it can be tricky. Ordinary shared libraries linked into the application work just fine because Ladebug knows about them when the program loads, and can read the debug information for the library and set breakpoints using that debug information.
But Ladebug doesn't know which library is going to be opened by a dlopen call, and so can't see the library or read its debug information. It will automatically load the debug information when the library is actually opened, so the breakpoint can be set at that time.
The trick is to stop at the right point in the library opening sequence, and then print out the name of the file being opened. At that point, Ladebug will have loaded the file's debug information and you can set breakpoints in the library.
This example prints the name of the library being opened, so that if it
is the wrong one you can just continue rather than return.
(ladebug) stop in foobar # Won't work before the library is loaded
Symbol "foobar" is not defined.
foobar has no valid breakpoint address
Warning: Breakpoint not set
(ladebug) stop in __dlopen { print (char*)$a0 }
(ladebug) r
[1] stopped at [<opaque> __dlopen(...) 0x3ff800dc4cc]
0x140000010="./library.so"
(ladebug) return # Finishes the load. Ladebug has read the debug information.
(ladebug) stop in foobar # Can now set your breakpoint
You only need to set the breakpoint once per session; on a
re-run, Ladebug will correctly re-install the breakpoint,
and you don't have to stop at __dlopen again.
This command does it all in one step, if you know the library
of interest is the first one to be opened:
when in __dlopen { disable $curevent; return; stop in foobar }
If you want to use a Ladebug debugger variable in a shell command, you can't just put it in the command, because Ladebug doesn't look at the body of a shell command.
There is a trick you can use, based on the record feature.
The example shows how to do this, moving the value of $hexints
into a file whose contents can then be used in the shell command echo:
(ladebug) record output foobar
(ladebug) p "ME", $hexints
ME 0
(ladebug) unrecord output
(ladebug) sh grep ME foobar | awk '{print $3}' > my_pid
(ladebug) sh echo `cat my_pid`
0
The grep is there to get rid of extraneous lines
(in this case the prompt for the line with unrecord), and the
awk selects the desired output from a line containing
the prompt, "ME" and the zero.
If you look at the generated file foobar, you will see that it ends with
"(ladebug) " without a trailing new-line. So if you do sh cat foobar
there will be an apparent double prompt: the last line of foobar
and the actual new prompt.
As in exporting Ladebug variables, you have to make a trip
through a file. Here's one method, where you construct a
file of Ladebug commands and then use the source
command to execute it.
The example shows importing the shell environment
variable SHELL into Ladebug as the Ladebug
variable $myshell.
(ladebug) sh echo "set \$myshell = \"$SHELL\"" > my.cmd
(ladebug) sh cat my.cmd
set $myshell = "/bin/csh"
(ladebug) source my.cmd
(ladebug) p $myshell
"/bin/csh"
The file is listed with the cat command for illustration
only; listing the file is not a required step.
The new GUI is still a work-in-progress, and not all of Ladebug's features are available in a "GUI-style" way yet. The GUI's command-input window may be used to access those features, using the traditional Ladebug command language.
Common GUI questions are addressed in the GUI FAQ.
If you find problems with the GUI, please report them to us! We also welcome suggestions and comments.
If you debug a core file on a system different from the one on which the core file was built, then you will have problems with shared library versions, because the core file itself does not contain the shared libraries used by your application.
The symptoms can range from subtly-wrong details to outright failure of Ladebug to let you debug. Unfortunately, Ladebug doesn't detect that you are debugging on a different version of the operating system than the core file was built on.
The good news is there is a way to do this correctly. The details are in the manual. The manual is large, so it may take some time to load.
What's this libpthreaddebug.so message mean?
You may see an error message like this:
This libpthreaddebug.so version 318037 cannot connect to a process running
libpthread 318042.
This can happen when you have a transported core file, and are debugging on a different machine or a different OS level than the core was created on. See the section above for how to deal with this.
We use the term "Heisenbug" for a bug caused by the use of a debugger; it is based on a humorous analogy with the well-known "Uncertainty Principle" described by Werner Heisenberg, which is often defined as saying "you can't look at something without impacting it". (We know this isn't what the "Uncertainty Principle" really says. Some of us have taken real physics courses. It's just a joke.)
There are several ways your application might notice that it is being debugged, and each of them has the potential to create a bug, depending on how your application works.
SIGINT (control-C) to interrupt the application.
If your application catches SIGINT, Ladebug
may be unable to work correctly, and if your application
generates SIGINT and expects to catch it in some
particular code, Ladebug may swallow the signal invisibly or get
confused.
pid value,
and you use the snapshot feature, your application will be able
to detect that Ladebug has changed the identity of the process running
your application.
Luckily, most applications do not do any of the things mentioned above. Unluckily, if they do, the confusion or bugs caused by interaction with Ladebug can be very hard to track down.
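To make the SIGINT case concrete, here is a minimal sketch of the kind of application code that can interact with a debugger this way: it installs its own SIGINT handler and then generates the signal itself. The function and variable names are hypothetical and exist only for illustration; they are not taken from any real application.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t got_sigint = 0;

/* Hypothetical application handler: just records that SIGINT arrived. */
static void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;
}

/* Installs the handler. Returns 0 on success, -1 on failure. */
int install_sigint_handler(void)
{
    return (signal(SIGINT, on_sigint) == SIG_ERR) ? -1 : 0;
}

/* Sends SIGINT to this process and reports whether the application's
 * own handler saw it: 1 if it did, 0 if it did not, -1 if raise()
 * failed. Code like this depends on the signal reaching the handler,
 * which is what may not happen as expected under a debugger. */
int raise_and_check_sigint(void)
{
    got_sigint = 0;
    if (raise(SIGINT) != 0)
        return -1;
    return got_sigint ? 1 : 0;
}
```

Run on its own, a program that calls install_sigint_handler() and then raise_and_check_sigint() sees its handler run. The point of the sketch is only that an application written this way has an expectation about SIGINT that a debugger can disturb.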
The following Tru64 UNIX patches clear up various Ladebug debugger problems that have been reported to us.
Customers can get these from http://www.support.compaq.com/patches/
The debugger has no special privileges or hooks into the system, so it cannot cause any crash that another debugger or other program could not also cause. Such crashes are therefore fundamental system problems, not Ladebug problems.
However, the debugger does exercise parts of the system (especially
procfs, ptrace, and pthreads) more than most other programs,
so crashes caused by these components tend to be reported as 'the debugger
crashed my system'. The Ladebug team sometimes gets such reports.
There are currently no such crashes known.
One reason .o and .so files can be large is
the duplication of information in them, caused by the many interrelationships
amongst the classes in the application.
Using the C++ -g rather than -gall switch
can radically reduce the size of your executable files.
Further reduction can be attempted using an undocumented feature of
the V5.6 and later C++ compilers. The compiler can be directed to put
out each class once, and only once, into a set of .o files.
The debugger will then find this single occurrence when debugging the
application containing all these .o files.
This is done by defining the environment variable
CXX_DEBUG_INFORMATION_CACHE
to be the full file-name of an empty file before invoking your make script,
as in the example below:
setenv CXX_DEBUG_INFORMATION_CACHE $TMPDIR/136132834732823953.txt
rm -f $CXX_DEBUG_INFORMATION_CACHE
touch $CXX_DEBUG_INFORMATION_CACHE
make -f your_make_script
rm -f $CXX_DEBUG_INFORMATION_CACHE
unsetenv CXX_DEBUG_INFORMATION_CACHE
CAUTION: C++ V6.0 and V6.1 tend to eliminate too many classes with this option. As soon as they have seen the incomplete class, they add it to the cache, and then never put out the complete class. The C++ V6.2 compiler has the fix for this problem.
This capability may reduce the size of the .o and
.so files in your application to the point where
the Ladebug debugger can handle it easily. Feedback on this
feature would be useful to us. If you find that your application size
either decreases greatly or fails to decrease,
please tell us so we can
help the compiler group refine their approach.
Apart from these approaches, the only other approach we can suggest is
compiling most of your sources -g0, and compiling just a
small subset -gall.
If you do a where and the stack seems to be missing routines,
you may be seeing the result of a compiler optimization called
"tail calls". That optimization works like this:
If a procedure MIDDLE calls a procedure INNER just
before returning and certain conditions are met, then MIDDLE
might simply jump to INNER, instead of doing a call.
No code has to be generated to do the stack manipulations which
would save MIDDLE's context and restore it after
the call to INNER. It's ok to do this because
there's no use of the saved context after the call, because it's
the last thing MIDDLE does.
After the jump, INNER will execute and then
return directly to MIDDLE's caller, OUTER, and
there will be no record of MIDDLE's existence on the stack.
If you stop in INNER and do where,
you will see INNER and OUTER, but not MIDDLE.
Since this transformation can occur more than once, it is possible for several intermediate calls to appear missing from the context stack.
The conditions which permit this optimization include, among others:
if MIDDLE returns a value, then INNER produces
the value to be returned;
no local variables of MIDDLE are used
during the execution of INNER;
MIDDLE does not establish any exception handlers.
There's no problem or bug when a routine is missing from the stack due to this optimization, but it can be confusing.
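As a minimal sketch of the call shape described above, the following C routines stand in for OUTER, MIDDLE, and INNER. The lower-case names and the arithmetic are purely illustrative, and whether a particular compiler actually turns the call into a jump depends on the compiler and optimization level, so treat that part as an assumption.

```c
/* INNER: does the real work. */
int inner(int x)
{
    return x * 2;
}

/* MIDDLE: its last action is the call to inner(), and it returns
 * inner()'s value unchanged. Nothing in middle() is needed after the
 * call, so an optimizing compiler may replace the call with a jump.
 * When it does, middle() leaves no frame on the stack. */
int middle(int x)
{
    int y = x + 1;
    return inner(y);    /* candidate tail call */
}

/* OUTER: uses the result after middle() returns, so this call is not
 * a tail call and outer() remains visible in a stack trace. */
int outer(int x)
{
    int r = middle(x);
    return r + 1;
}
```

If the optimization is applied, a breakpoint in inner() followed by where would be expected to show inner and outer only. Compiled without optimization, all three routines would normally appear.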
If what you see on a where
is a stack which just
doesn't make sense (i.e. random numbers without any routine names),
then it's likely that your application has gotten lost.
(ladebug) where
>0 0x11ffff7d4
Typically this kind of stack means that your application has lost track of the real stack and real code location, and is now executing random bits of memory, interpreting it as instructions.
If you're coding in C++, one of the most common ways to get a nonsense stack is for your code to try to execute a method on an invalid object. If the object has already been deleted, has not yet been initialized, is not there, or is of a completely different class, then the virtual function table won't be correct, and the application will be treating random memory as the virtual function table and calling a random place.
The ring tool is a good way to track this kind of problem down, as it will let you find out where your application was when it made the call into nowhere.
exec
?
Normally, Ladebug shows you the execution point of your application,
and if you next or step from that point,
you'll probably get to the place you expect, unless something else
happens.
There are two cases where this isn't true. One is on start-up,
and the other is after an exec has been caught (by
use of "set $catchexecs = 1").
In both cases, Ladebug presents the application to you as though
it were about to execute the first line in main. In the start-up
case, this is because there is as yet no execution point and it might
as well present you with something you're probably interested in; in the
exec case, this is because the application is executing
system code to do the work of the exec, and is not
yet executing the code you wrote, so we treat it like the start-up
case.
When your application is in this state, you may set breakpoints
on initialization routines and constructors for static or global
items and know you'll hit them. This is why Ladebug doesn't
automatically run your application forward to the start of
main. But Ladebug has next to no information about
the loader, and showing it to you doesn't make much sense, as
that's not a context you can do any meaningful work in. So
Ladebug shows your context as main.
Unlike the start-up case, in the exec case
it is possible to use motion commands, since the application is running.
However, the motion specified by next or step
may lead to confusing results, as it will
be a next or step from the position
in system code. What that code does, and where a motion winds
up stopping, may vary from one OS release to another.
Thus for some releases, next from a caught exec
will stop at __start, while for others it will stop
only when the application terminates. It is safer to place a
breakpoint at a known place (like __start or main)
and continue than to use next or step
after catching an exec.
The behavior of a next or step command issued
before the exec call is also hard to define;
again, it's better to set a breakpoint after the exec is
caught and continue than to try to follow the system code as it
loads in a new image and transfers control, a process which can
vary in details from release to release.
If you get the following error message when you attempt to call your
function x, then the function
may not be visible or may have another name:
Symbol "x" is not defined
Try running the application to a point where the function is visible to the current context by language rules. For example, static functions in other files are not visible outside of that file.
Or maybe your symbol is different by the time the C pre-processor is
done. Consider the following example from the SPEC CPU2000 benchmark
253.perlbmk (a reduced version of the popular perl program):
(ladebug) file sv.c
(ladebug) list 3099,3101
3099 I32
3100 sv_cmp(register SV *str1, register SV *str2)
3101 {
(ladebug) call sv_cmp (0x14019a2a0, 0x14019a360)
Symbol "sv_cmp" is not defined.
(ladebug)
Although the original source code plainly says that the name of the
routine is sv_cmp, this is not the name that the compiler
sees. Find out what the compiler sees by just running the C pre-processor:
% diff sv.c sv.c-commented
3100c3100
< sv_cmp(register SV *str1, register SV *str2)
---
> sv_cmp(register SV *str1, register SV *str2) /* plugh */
% cc -E -C -DSPEC_CPU2000_DUNIX sv.c-commented > tmp.tmp
% grep plugh tmp.tmp
Perl_sv_cmp(register SV *str1, register SV *str2) /* plugh */
%
The above example uses two tricks to make the pre-processor output easier
to interpret:
-C retains comments in the pre-processor output
/* plugh */ is an arbitrary comment inserted to make it easy
to find our place in the pre-processor output
By putting in a plugh and then grep-ing for
it, we discover that by the time all the magic is done with perl's
various .h files, our routine name has become Perl_sv_cmp.
And, indeed, Ladebug can call it using that name:
(ladebug) print Perl_sv_cmp(0x14019a2a0,0x14019a360)
0
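The renaming effect can be reproduced in a few lines of C. This is only a sketch: the macro and both function names below are hypothetical stand-ins for what perl's header files do, not the actual perl definitions.

```c
/* Hypothetical header-style macro: every later use of "demo_cmp" in
 * the source is rewritten by the pre-processor to "Demo_prefixed_cmp". */
#define demo_cmp Demo_prefixed_cmp

/* The source text says demo_cmp, but the compiler only ever sees
 * Demo_prefixed_cmp, so that is the name recorded in the symbol table
 * and the name a debugger has to be given.
 * Returns -1, 0, or 1 as a is less than, equal to, or greater than b. */
int demo_cmp(int a, int b)
{
    return (a > b) - (a < b);
}
```

In a debugger session on a program containing this code, a call to demo_cmp would be expected to fail with an undefined-symbol message, while Demo_prefixed_cmp would be found.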
My program hangs when I debug the part with a fork!
If you catch the child but not the parent, and the parent code tries
to execute a wait on the child, then the target will get
stuck forever, with no progress being made.
Without the wait, if the parent process doesn't stop
(no breakpoints or signals) and continues to compute, it can look like
a hang, as you will have to wait for the parent process to complete
before you can debug the child.
Ladebug can only focus on one process at a time, and starts out focused on the parent. What's happened is that after the fork is caught, Ladebug is still focused on the parent, which is still running, so Ladebug isn't talking to you; the parent is at the wait call, so it's not talking to you; the child is stopped on the fork, but doesn't have Ladebug focus, so it's not talking to you.
We therefore recommend always stopping the parent as well as the child when debugging programs that fork.
If you think Ladebug is hung, try typing Ctrl/C, which should get the Ladebug
prompt back after stopping the parent. The where command will
show you whether the parent is in the wait call, and the
show process all command will show you which process
has Ladebug focus.
For an example, see the section of the manual on debugging programs that fork.
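The program shape that produces this apparent hang is sketched below: a parent that forks and then immediately blocks waiting for the child. The function name fork_and_wait is hypothetical, and the child does nothing but exit; a real application would do its work there.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with child_status, then blocks in waitpid()
 * until the child finishes. Returns the child's exit status, or -1 on
 * error. If a debugger is holding the child stopped at the fork while
 * the parent keeps running, the parent sits in waitpid() and nothing
 * appears to happen. */
int fork_and_wait(int child_status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                  /* fork failed */

    if (pid == 0)
        _exit(child_status);        /* child: finish immediately */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;                  /* parent: wait failed */

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Outside a debugger the parent returns as soon as the child exits. With only the child caught, the parent never gets past waitpid(), which is why stopping the parent as well is the safer setup.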
When Ladebug attaches to a running program, it doesn't stop it automatically, so the program keeps running.
To stop it, you can type a control-C or click the Interrupt button on the GUI.
To stop all attached programs automatically, create a
.ladebugrc file in your home directory containing this line:
set $stoponattach = 1
This will tell Ladebug to stop all programs after attaching.
Why can't I set my variables?
Why both set and assign?
Ladebug uses set to change debugger variables
and assign to change program variables, because
they have different lifespans and different semantics.
If you try to set a program variable, what you'll
really do is create a new debugger variable with the same name
as the program variable.
When there is a conflict because a program variable and a
debugger variable have the same name, it's undefined which one
Ladebug will find when evaluating the name. The example shows
how a set creates a new debugger variable rather
than changing the value of the program variable. Note that in
this case the debugger variable i is hidden by the
program variable i, but becomes visible when the program
variable disappears!
(ladebug) p i
0
(ladebug) set i = 9
(ladebug) p i
0
(ladebug) unload
Process has exited
(ladebug) p i
9
For this reason, we encourage users to start all debugger variable
names with a dollar-sign ("$"). None of the
languages supported by Ladebug allows an initial dollar-sign,
so there can be no name conflicts if you follow that convention.
Sometimes when you try to watch a variable, Ladebug won't do it:
(ladebug) watch variable i
Unable to take address for i
Warning: Watchpoint not set.
This happens when the variable is in a register, which has no
address. Ladebug implements watch with page protection,
which works on variables allocated in memory.
There are two ways around this problem:
Declare the variable static, which will force it to memory. This
may require other code changes to maintain the correct
semantics.
Use the stopi command. This will check
the contents of the register after each instruction.
You can set up the stopi
breakpoint in such a way that it is only enabled during the
lifetime of the variable in question:
(ladebug) stopi i
(ladebug) when at 100 { enable 1 }
(ladebug) when at 200 { disable 1 }
Please note that the second solution may give odd results if
the lifetime of the variable extends beyond the area where the
stopi breakpoint is enabled: any changes in that
part of its lifetime will be announced when the breakpoint is
re-enabled.
Further, if the stopi breakpoint is enabled outside the
lifetime of the variable, Ladebug will complain because the
reference is now undefined.
In general, we recommend watch over
stopi where possible, for these and other reasons.
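The first workaround, and the kind of extra code change it can require, might look like the following sketch. Both routines and their names are hypothetical. The second version forces the counter into memory with static and therefore has to reset it on entry, because a static variable keeps its value between calls.

```c
/* Counter is an ordinary local: an optimizing compiler may keep it in
 * a register, in which case it has no address to watch. */
int count_positive(const int *values, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (values[i] > 0)
            count++;
    }
    return count;
}

/* Same routine with the counter declared static, so it lives in memory
 * and has an address. The explicit reset is the "other code change":
 * without it, the count would carry over from the previous call. A
 * static local also makes the routine unsafe to run in several threads
 * at once. */
int count_positive_static(const int *values, int n)
{
    static int count;
    count = 0;
    for (int i = 0; i < n; i++) {
        if (values[i] > 0)
            count++;
    }
    return count;
}
```

Both versions return the same result for the same input. The static version is meant as a temporary debugging aid, to be reverted once the problem is found.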
print get a parse error?
state or thread?
If you are running into this problem, you are most likely using Ladebug -64 or older. What follows explains when this problem would occur in older Ladebug debuggers and how to get around it.
As the manual mentions, variables with names which are the same as Ladebug keywords will produce parse errors due to the way Ladebug parses commands.
The names in question are the following:
at
if
in
state
thread
with
To print these names, make them expressions by enclosing them in parentheses:
print (state)
thread, at, if, and in if and only if they occur in the expression's
in the following commands:
where expression
stopi expression
trace expression
tracei expression
wheni expression
where are wrong!
If you do a where
command and look up the stack
(down the listing) at the reported parameters, you may be surprised to find that
they are shown with different values than you expect them
to have.
Parameters are passed in places defined by the Alpha calling standard (the first six in registers R16, R17, etc.). Compilers generating code for a routine may decide to leave a parameter in a register, or move it to another register or to a location in the stack. If the parameter isn't used after a certain location in the code, the compiler may decide to re-use the register or stack location for another value in the code following that point.
The compiler describes this to the debugger through debug information in the executable. These records tell Ladebug where a variable may be found when the routine is executing within a specified PC range.
When Ladebug does the where command, it looks up the
stack and does a virtual unwind, re-creating for itself the
context of the routines on the stack. Then it uses
the debug information from the compiler and the routine's
recreated PC value to figure out where the parameter
variables are located so it can extract and print
their values.
There are two ways this can fail without being a bug:
Our advice for debugging is to always debug an un-optimized
executable (-g) if you can; if you must debug an optimized
executable (compiled -g3 -O<something>), be aware that
occasionally Ladebug will be confused by an optimization.
Get the latest compilers and Ladebug kits and see if the problem is fixed.
If you find that the problem is not fixed, send mail to Ladebug.Support@hp.com containing the following:
uname -rsv
)
The smaller the reproducer, the more likely you are to have the problem fixed in the near future.