Boundary on OpenBSD anyone?

Can anyone confirm that Boundary v0.4.x or 0.5.x actually works on OpenBSD-6.9 amd64, using the binaries from releases.hashicorp.com, e.g.
/boundary/0.5.1/boundary_0.5.1_openbsd_amd64.zip?

I have a reference installation of Boundary v0.4.0 that works perfectly well on FreeBSD 12.1 amd64 (actually OPNsense/HardenedBSD) which uses PostgreSQL 12.

After moving and adapting the Boundary configurations from that FreeBSD setup to OpenBSD-6.9, and dumping its PostgreSQL 12 database and restoring it to PostgreSQL 13 on OpenBSD, the Boundary Controller appears to start up fine, without showing any errors in the event logs.
Also, it answers TCP connects from Web browsers to its WebUI at port 9200 with SYN/ACK, but never serves Boundary’s login page. Unfortunately, I cannot get any logs from Boundary yet in order to figure out what’s wrong.
Of course, CORS is a prime suspect, but I have checked various settings, among them the tried and tested one from the working FreeBSD setup.

Further, I have rendered the new PostgreSQL 13 database on OpenBSD accessible to the working Boundary Controller on FreeBSD 12, and that works fine too. So this indicates that Boundary on FreeBSD works well using the new PostgreSQL 13 database on OpenBSD.

Vice versa, I have also tested the Boundary Controller on OpenBSD, giving it access to the known-good PostgreSQL 12 database on FreeBSD. Judging from tcpdump on port 5432/tcp, Boundary appears to access PostgreSQL fine, but still produces no HTML output on its WebUI at port 9200/tcp.

Migrating from Boundary 0.4.0 to 0.5.0 and now 0.5.1, as well as re-initialisation of Postgresql & Boundary databases did not help on OpenBSD either.

Comparing the file sizes of the binaries for FreeBSD and OpenBSD of Boundary, it looks as if the code necessary for the WebUI is indeed included in the OpenBSD binary as well.
Also, I have not found any indication that the WebUI must be enabled, unlike in Consul or Vault for example.

Thank you for any hints and clue sticks!

Update: Attempts to log in via the CLI to localhost:9200 time out, although Boundary is listening on 127.0.0.1:9200/tcp. This may indicate that the issue lies not with the WebUI or the built-in HTTP server, but with some other subsystem in Boundary’s binary for OpenBSD:

[r@gf:~]$ doas -u _boundary boundary authenticate password \
 -keyring-type=none \
 -auth-method-id=ampw_8... \
 -login-name=admin \
 -password=Xi...
Error trying to perform authentication: error performing client request during Authenticate call: context deadline exceeded
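
A small probe can separate "the TCP handshake completes" from "the server actually answers", which is the distinction the timeout above hints at. The following is only a sketch in Go and not part of Boundary: the names probe, probeResult and silentListener are made up for illustration, and the plain "GET /" request is an assumption about what is enough to provoke any reply at all.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// probeResult records how far a single request got.
type probeResult struct {
	Connected bool  // the TCP handshake completed
	Responded bool  // at least one byte of a reply was read
	Err       error // first error encountered, if any
}

// probe dials addr, sends a minimal HTTP/1.1 request and waits for the
// first byte of a reply, all within the given timeout.
func probe(addr string, timeout time.Duration) probeResult {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return probeResult{Err: err}
	}
	defer conn.Close()

	res := probeResult{Connected: true}
	conn.SetDeadline(time.Now().Add(timeout))
	_, err = fmt.Fprintf(conn, "GET / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n", addr)
	if err != nil {
		res.Err = err
		return res
	}
	buf := make([]byte, 1)
	if _, err := conn.Read(buf); err != nil {
		res.Err = err
		return res
	}
	res.Responded = true
	return res
}

// silentListener is a hypothetical stand-in that mimics the symptom: the
// kernel completes the handshake, but nothing ever accepts or replies.
func silentListener() (addr string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	return ln.Addr().String(), func() { ln.Close() }, nil
}

func main() {
	if len(os.Args) > 1 {
		// Probe the address given on the command line.
		fmt.Printf("%s: %+v\n", os.Args[1], probe(os.Args[1], 5*time.Second))
		return
	}
	// Without arguments, demonstrate the symptom against the stand-in.
	addr, stop, err := silentListener()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer stop()
	fmt.Printf("%s: %+v\n", addr, probe(addr, time.Second))
}
```

Running it as "go run probe.go 127.0.0.1:9200" against the controller would show whether the connection is established but no reply ever arrives. That would point at the code behind the listener and away from packet filtering or the listener itself.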

The log files still contain only a handful of records, all written at the moment I abort the Boundary Controller (which I run in the foreground) with Ctrl-C, despite writing all enabled Observations & Sysevents to a file using event_type=["*"].

Update 2: From main/scripts/build.sh , I understand that HashiCorp’s build system generates binary releases for OpenBSD by cross-compiling using OSARCH=openbsd/amd64.

On a stable OpenBSD-6.9 amd64 host

[r@gf:boundary]$ uname -a
OpenBSD gf 6.9 GENERIC.MP#3 amd64

native builds of Boundary v0.5.1 from source fail in two steps:

  1. ‘gmake tools’ stops with
[r@gf:boundary]$ gmake tools
go generate -tags tools tools/tools.go
go: downloading github.com/favadi/protoc-go-inject-tag v1.1.0
go: downloading github.com/bufbuild/buf v0.37.0
...
go: downloading github.com/klauspost/pgzip v1.2.5
go: downloading golang.org/x/net v0.0.0-20210510120150-4163338589ed
go: downloading github.com/shurcooL/sanitized_anchor_name v1.0.0
# github.com/bufbuild/buf/internal/pkg/interrupt
/home/rs/go/pkg/mod/github.com/bufbuild/buf@v0.37.0/internal/pkg/interrupt/interrupt.go:40:25: undefined: signals
tools/tools.go:17: running "go": exit status 2
gmake: *** [Makefile:22: tools] Error 1
[r@gf:boundary]$ 
  2. ‘gmake bin’ is not happy about:
[r@gf:boundary]$ gmake tools
node-pre-gyp ERR! System OpenBSD 6.9
...
node-pre-gyp ERR! node -v v12.16.1
node-pre-gyp ERR! node-pre-gyp -v v0.11.0
...
Please ensure Node v14+ and Yarn v1.22.10+ are installed.

Now, I will set up a fresh build host using OpenBSD-current from Snapshots which has the required dependencies:

 https://cdn.openbsd.org/pub/OpenBSD/snapshots/packages/amd64/
node-12.16.1p2.tgz    
yarn-1.22.11.tgz  

Then, I want to retry the build process and will report. Thanks.

That’s an old version of buf, but buf isn’t compiled into the main binary. You probably do not need to run ‘make tools’ at all. It’s mostly there to get specific versions of specific tools needed for dev work into the toolchain.

It should be sufficient for you to run just ‘make dev’. You’ll still need the dependencies to run the UI build step though. I could probably put in a makefile target that doesn’t require that if you need (or edit the makefile for the dev target and remove the two UI related lines).

Thanks Jeff for your hint. In the Makefile of the 0.5.1 branch, I commented out just this line:

#dev: BUILD_TAGS+=ui

With that change, ‘$ gmake dev’ produced a binary with the same Git Revision tag as the binary from the release, but only 45.8 MB in size, compared to 53.7 MB for the release.

I guess this is the missing UI component which I still need to get building. Have to check my notes from last month when I managed to build the UI component of Nomad whose port on OpenBSD comes without UI at the moment (intending to submit PR later to change that).

Currently, I am struggling to get the dependency Node.js 16.7.0 to build on OpenBSD-current/snapshot (which only has a port of 12.16.1 stable). The build failed using gcc; retrying with clang now…

You could test whether the binary works though without the UI, by using the CLI, before spending the effort on Node.

Also the “dev” version without UI, freshly built from source tagged 0.5.1, does not reply to any REST API calls: the server/controller accepts TCP connections but never replies, so for example auth requests time out. Meanwhile, tcpdump on lo0 shows that the controller is periodically polling PostgreSQL on localhost:5432/tcp.

Most irritating is that there is no log output, neither on the console while running the controller in the foreground, nor in any log files, regardless of BOUNDARY_DEVELOPER_ENABLE_EVENTS=true and my first attempts of events configuration.

In the meantime, I managed to compile Node-14.17.5 LTS using the clang that comes with OpenBSD-current/snapshot. Only the last step still fails, when the build links the objects into an executable (linker library -ldl missing).

But next, as you suggested, and from prior experience getting Consul and Vault to work on OpenBSD, I will widen the search to things like increasing limits, such as the number of open files in the login class of the _boundary user which starts boundary as a daemon, etc.
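
To see which open-file limit a process actually inherits from its login class, a few lines of Go are enough. This is a minimal sketch that assumes it is built and then run as the _boundary user (for example via doas -u _boundary); the function name openFileLimit is made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// openFileLimit returns the soft and hard RLIMIT_NOFILE values of the
// current process, i.e. what a daemon started the same way would inherit.
func openFileLimit() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	// The field types differ between platforms, so normalise to uint64.
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getrlimit:", err)
		os.Exit(1)
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

If the soft value turns out to be low, the openfiles capabilities of the login class in login.conf(5) would be the place to raise it. Whether that is related to the hang here is an open question.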

As ‘gmake dev’ without UI is quite quick even on low-power APU boards from PC Engines, I might sprinkle some of my own print statements in the source which might help to track down my issue which appears to be specific to OpenBSD, such as permissions, limits, errors in my boundary configuration, etc.
Or would you recommend resorting to a debugger, such as GDB or Delve? Thanks.

If you have absolutely no output at all on either stdout or stderr then it seems things are failing after the listeners are set up but before we get to the end of the setup process. Can you try a SIGQUIT to get a stack trace?

Thanks, SIGQUIT dumps a nice stack trace using the released 0.5.1.
Would you find a moment to take a quick look at it as well? I could put it up as a Gist, for example, and also send you my (sanitized) configuration file.

The output of Goroutine 0 below makes me wonder if it uses the runtime from a previous Go installation at /usr/local/go/src/… for some reason, and not the one from the statically linked Boundary release binary:

SIGQUIT: quit
PC=0x2a3e96dca m=7 sigcode=0

goroutine 0 [idle]:
runtime.kevent(0x200000003, 0x0, 0x200000000, 0x284a62c30, 0x40, 0x284a62c08, 0x0)
        /usr/local/go/src/runtime/sys_openbsd2.go:178 +0x39
runtime.netpoll(0x215743c47, 0x0)
        /usr/local/go/src/runtime/netpoll_kqueue.go:127 +0xae
runtime.findrunnable(0xc00004e800, 0x0)
        /usr/local/go/src/runtime/proc.go:2923 +0x3ee
runtime.schedule()
        /usr/local/go/src/runtime/proc.go:3169 +0x2d7
runtime.goexit0(0xc0003c8300)
        /usr/local/go/src/runtime/proc.go:3478 +0x1de
runtime.mcall(0x80000)
        /usr/local/go/src/runtime/asm_amd64.s:327 +0x5b

goroutine 1 [select, 5 minutes]:
github.com/hashicorp/boundary/internal/cmd/commands/server.(*Command).WaitForInterrupt(0xc000566000, 0x0)
        /go/internal/cmd/commands/server/server.go:599 +0xcb
github.com/hashicorp/boundary/internal/cmd/commands/server.(*Command).Run(0xc000566000, 0xc00003a0c0, 0x3, 0x3, 0x0)
        /go/internal/cmd/commands/server/server.go:480 +0x179a
github.com/mitchellh/cli.(*CLI).Run(0xc000672000, 0xc000672000, 0xc00030d158, 0xc000664080)
        /root/go/pkg/mod/github.com/mitchellh/cli@v1.1.2/cli.go:262 +0x41a
github.com/hashicorp/boundary/internal/cmd.RunCustom(0xc00003a0b0, 0x4, 0x4, 0xc000839e60, 0xc00003e0b8)
        /go/internal/cmd/main.go:186 +0x846
github.com/hashicorp/boundary/internal/cmd.Run(...)
        /go/internal/cmd/main.go:92
main.main()
        /go/cmd/boundary/main.go:13 +0xda
...

Although:

[rs@greif:~]$ /usr/local/bin/boundary version     

Version information:
  Git Revision:        5f88243ddc6182db9c71ba84fd401040de4f5d41
  Version Number:      0.5.1

/usr/local/bin/boundary: ELF 64-bit LSB executable, x86-64, version 1
[rs@greif:~]$ ldd /usr/local/bin/boundary  
/usr/local/bin/boundary:
        Start            End              Type  Open Ref GrpRef Name
        0000000000400000 0000000002e14000 exe   2    0   0      /usr/local/bin/boundary
        0000000231fc2000 00000002320b6000 rlib  0    1   0      /usr/lib/libc.so.96.0
        000000025b5c3000 000000025b5cf000 rlib  0    1   0      /usr/lib/libpthread.so.26.1
        000000020482b000 000000020482b000 ld.so 0    1   0      /usr/libexec/ld.so
[rs@greif:~]$ 

Compare that to the binary of the previous version 0.4.0 for FreeBSD-amd64 which runs fine on OPNsense/HardenedBSD 12.1:

root@hydra:~ # uname -a
FreeBSD hydra 12.1-RELEASE-p11-HBSD FreeBSD 12.1-RELEASE-p11-HBSD #0  74f1f081a1e(stable/20.7)-dirty: Fri Dec  4 13:40:15 CET 2020     root@sensey64:/usr/obj/usr/src/amd64.amd64/sys/SMP  amd64
root@hydra:~ # which boundary
/usr/local/bin/boundary
root@hydra:~ # /usr/local/bin/boundary version

Version information:
  Git Revision:        0b66464a3a173d5cd28a41924fb661d9e68b33c5
  Version Number:      0.4.0

root@hydra:~ # file /usr/local/bin/boundary
/usr/local/bin/boundary: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), statically linked, Go BuildID=d5v2bRWTq2Dz5iQ6OHj8/MIi_YWd0ctXqTYFwMdWa/ch_xEwBGDqtSEHQkV6xq/vO4-2FYi6CB-w3axZl-H, not stripped
root@hydra:~ # ldd /usr/local/bin/boundary
ldd: /usr/local/bin/boundary: not a dynamic ELF executable
root@hydra:~ # 

So, the OpenBSD release is not linked statically, and that might actually be the issue. Apparently, it dynamically loads & links a Go runtime which is lying around locally on my OpenBSD system?
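
The linkage can also be checked from Go itself with the standard debug/elf package, independently of file and ldd. The sketch below lists the shared libraries a binary declares as needed and reports whether it requests a dynamic loader; the names linkage and inspect are made up for illustration. As far as I know, the source paths in a Go stack trace are recorded at build time, so /usr/local/go/src/... appearing in the trace does not by itself show that a locally installed runtime is being loaded.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// linkage summarises how an ELF binary is linked.
type linkage struct {
	Interp bool     // a PT_INTERP segment is present (dynamic loader requested)
	Libs   []string // DT_NEEDED entries, empty for a static binary
}

// inspect opens the ELF file at path and collects its linkage details.
func inspect(path string) (linkage, error) {
	f, err := elf.Open(path)
	if err != nil {
		return linkage{}, err
	}
	defer f.Close()

	var l linkage
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			l.Interp = true
		}
	}
	libs, err := f.ImportedLibraries()
	if err != nil {
		return l, err
	}
	l.Libs = libs
	return l, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: linkage /path/to/binary")
		os.Exit(2)
	}
	l, err := inspect(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("dynamic loader requested: %v\n", l.Interp)
	for _, lib := range l.Libs {
		fmt.Println("needs:", lib)
	}
}
```

For the release binary above, I would expect this to list only the libc and libpthread entries that ldd shows, and nothing related to a Go runtime.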

This appears to be “by design” - slowly catching up with recent Go on OpenBSD.
Go had to adapt as OpenBSD no longer allows system calls that do not pass through its libc; see for example “all: stop using direct syscalls on OpenBSD #36435” or “Go 1.16 will make system calls through Libc on OpenBSD”.

Will have to go down that rabbit hole, and how to debug why Boundary seems to get stuck, at least (parts of) its Controller…

Also with statically linked binaries, at least the API listeners of the 70 goroutines get stuck.

The Go 1.16 Release Notes imply that OpenBSD 6.9+ can still run static Go binaries.

The following “hack” of the LD_FLAGS of gox in scripts/build.sh “forces external linking” and produces static binaries for Boundary 0.5.1, tested on both OpenBSD-amd64 6.9 with Go 1.16.2 and 7.0-current snapshot with Go 1.16.6:

-   -ldflags "${LD_FLAGS}-X github.com/...
+   -ldflags "-linkmode external -extldflags '-fno-PIC -static' -X github.com/...

So the debugging continues, any suggestions?

Well TIL about OpenBSD behavior with Go changing in 1.16. :neutral_face:

Are those the only two goroutines you got on SIGQUIT? At that point I’d expect there to be many, many more. The only thing I can see from there is that it seems to have completed startup.

This seems like it’s not in dev mode, any luck with dev mode compared to starting as boundary server?

@jeff SIGQUIT caused 70 goroutines to be dumped in total. The two shown above are only the ones listed at the top of the output.
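
With that many goroutines, a summary by wait state is easier to compare between the OpenBSD and FreeBSD hosts than the raw dump. Here is a rough sketch that reads a SIGQUIT dump on stdin and counts the "goroutine N [state...]:" header lines; the function names are made up, and it assumes the dump uses the standard Go traceback format shown earlier.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"regexp"
	"sort"
	"strings"
)

// header matches lines such as "goroutine 1 [select, 5 minutes]:" and
// captures the state ("select"), without the optional wait duration.
var header = regexp.MustCompile(`^goroutine \d+ \[([^\],]+)`)

// summarize counts goroutines per state in a Go stack dump.
func summarize(r io.Reader) (map[string]int, error) {
	counts := make(map[string]int)
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long frame lines
	for sc.Scan() {
		if m := header.FindStringSubmatch(sc.Text()); m != nil {
			counts[m[1]]++
		}
	}
	return counts, sc.Err()
}

// summarizeString is a convenience wrapper for dumps held in memory.
func summarizeString(s string) (map[string]int, error) {
	return summarize(strings.NewReader(s))
}

func main() {
	counts, err := summarize(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	states := make([]string, 0, len(counts))
	total := 0
	for s, n := range counts {
		states = append(states, s)
		total += n
	}
	sort.Strings(states)
	for _, s := range states {
		fmt.Printf("%4d  %s\n", counts[s], s)
	}
	fmt.Printf("%4d  total\n", total)
}
```

Feeding it the saved dump (for example "go run summary.go < dump.txt", where dump.txt is whatever file the trace was saved to) gives one line per state plus a total.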

AFAIK the *BSDs, with the exception of macOS, do not support Docker containers. Therefore, Boundary refuses to start in dev mode.

Today, I checked some parameters at runtime using environment variables, such as GODEBUG="inittrace=1,scheddetail=1000", which shows 8 or 9 threads of which 4 are idle. This looks very similar to the known-good reference on FreeBSD.
Also, I started to study the sources where I should set break- and watchpoints in GDB in order to try and figure out why the API listener does not respond to any requests at all, and also does not output any logs/events, etc.

Unfortunately, Delve has not been fixed yet on OpenBSD. But I got ncurses-based cgdb with egdb (the new gdb on OpenBSD) up and running with Go Debugging including Go runtime support loaded (also for dynamically loaded & linked Go binaries), which allows looking at source, goroutines, etc.

Tomorrow, I should give the Profiling and Execution Tracer as outlined in Go’s Diagnostics page another try. They might teach me the general program flow and eventually reveal spots that deserve closer inspection.

Do you have suggestions for watchpoints and breakpoints which might reveal why Boundary is “stuck” without any apparent output? Actually, its goroutines seem to be just sitting mostly idle, waiting for events to happen and to take action, besides polling PostgreSQL periodically…

Hi there,

You don’t have to use Docker with dev mode – if you pass -database-url to boundary dev you can point it at a bare postgres database.

If you let me know what error message you see when trying in dev mode without that flag, I can see where things are failing and try to make the error message better (have it suggest the above flag).

Thank you Jeff, good to know and very interesting: dev mode with that flag
$ doas -u _boundary boundary dev -database-url=postgresql://user:pwd@localhost:5432/boundary?sslmode=disable returns
Error connecting to database: error creating global scope kms keys: kms.CreateKeysTx: unable to create root key in scope global: kms.createRootKeyTx: root keys: db.Create: create failed: duplicate key value violates unique constraint "kms_root_key_scope_id_key": unique constraint violation: integrity violation: error #1002.

Without that flag -database-url it returns
Error creating dev database container docker is not currently supported on this platform.

Note that you can’t use dev mode against an existing DB – that’s why you’re seeing those errors. Even running dev mode twice, you currently need a fresh DB.

We thought about changing that, and may at some point, but if you’re running dev mode against the same DB over and over you might as well just not run in dev mode, so we haven’t changed it yet :slight_smile:

OK, understood. Here is the output of Boundary 0.5.1 release in dev mode on OpenBSD-6.9 amd64. Also, see the output of SIGQUIT at the end of this file Boundary_DevMode_OpenBSD.txt (229.8 KB).

It looks like the cluster controller at localhost:9201 fails to respond after a successful initial connection?

I intend to retry this after instrumenting Boundary with runtime/trace and net/http/pprof in order to siphon traces from the running Boundary dev instance for analysis using go tool trace ..., etc.
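
The instrumentation itself is small. This is a minimal sketch of the idea, not a patch against Boundary: the function startDiagnostics, the address 127.0.0.1:6060 and the file name boundary.trace are assumptions for illustration, and where this would be hooked into Boundary's server command is left open. The sleep in main only stands in for the real workload.

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/ handlers on the default mux
	"os"
	"runtime/trace"
	"time"
)

// startDiagnostics begins writing an execution trace to tracePath and
// serves the pprof endpoints on addr. The returned stop function flushes
// and closes the trace; it must be called before the process exits.
func startDiagnostics(addr, tracePath string) (stop func(), err error) {
	f, err := os.Create(tracePath)
	if err != nil {
		return nil, err
	}
	if err := trace.Start(f); err != nil {
		f.Close()
		return nil, err
	}
	go func() {
		// Serve the default mux, which now carries the pprof handlers.
		log.Println("pprof listener:", http.ListenAndServe(addr, nil))
	}()
	return func() {
		trace.Stop()
		f.Close()
	}, nil
}

func main() {
	stop, err := startDiagnostics("127.0.0.1:6060", "boundary.trace")
	if err != nil {
		log.Fatal(err)
	}
	defer stop()

	// Placeholder for the real workload (the server's run loop).
	time.Sleep(2 * time.Second)
}
```

The resulting file can then be opened with go tool trace boundary.trace, and a full goroutine dump is available while the process runs at http://127.0.0.1:6060/debug/pprof/goroutine?debug=2, which avoids having to kill the process with SIGQUIT.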

There are all sorts of strange things going on there. The worker connects to the controller but then all RPCs to the worker fail. The eventer appears at one point to fail to write to stderr(!).

I don’t really know what’s going on but I’m curious what your tracing turns up…