This repository was archived by the owner on Jul 13, 2025. It is now read-only.
forked from tailscale/tailscale
-
Notifications
You must be signed in to change notification settings - Fork 0
Fork Sync: Update from parent repository #36
Open
github-actions
wants to merge
963
commits into
MultiMx:main
Choose a base branch
from
tailscale:main
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Existing compaction logic seems to have had an assumption that markActiveChain would cover a longer part of the chain than markYoungAUMs. This prevented long, but fresh, chains, from being compacted correctly. Updates tailscale/corp#33537 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We use `tka.AUMHash` in `netmap.NetworkMap`, and we serialise it as JSON in the `/debug/netmap` C2N endpoint. If the binary omits Tailnet Lock support, the debug endpoint returns an error because it's unable to marshal the AUMHash. This patch adds a sentinel value so this marshalling works, and we can use the debug endpoint. Updates #17115 Signed-off-by: Alex Chan <alexc@tailscale.com> Change-Id: I51ec1491a74e9b9f49d1766abd89681049e09ce4
As part of the conn25 work we will want to be able to keep track of a pool of IP Addresses and know which have been used and which have not. Fixes tailscale/corp#34247 Signed-off-by: Fran Bull <fran@tailscale.com>
…co key rotation Adds the ability to rotate discovery keys on running clients, needed for testing upcoming disco key distribution changes. Introduces key.DiscoKey, an atomic container for a disco private key, public key, and the public key's ShortString, replacing the prior separate atomic fields. magicsock.Conn has a new RotateDiscoKey method, and access to this is provided via localapi and a CLI debug command. Note that this implementation is primarily for testing as it stands, and regular use should likely introduce an additional mechanism that allows the old key to be used for some time, to provide a seamless key rotation rather than one that invalidates all sessions. Updates tailscale/corp#34037 Signed-off-by: James Tucker <james@tailscale.com>
…17955) We now embed node information into network flow logs. By default, netlogfmt still prints out using Tailscale IP addresses. Support a "--resolve-addrs=TYPE" flag that can be used to specify resolving IP addresses as node IDs, hostnames, users, or tags. Updates tailscale/corp#33352 Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Updates tailscale/corp#25406 Change-Id: I7832dbe3dce3774bcc831e3111feb75bcc9e021d Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
(trying to get in smaller obvious chunks ahead of later PRs to make them smaller) Updates #17925 Change-Id: I184002001055790484e4792af8ffe2a9a2465b2e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit modifies the helm/static manifest configuration for the k8s-operator to prefer the stable image tag. This avoids making those using static manifests seeing unstable behaviour by default if they do not manually make the change. This is managed for us when using helm but not when generating the static manifests. Updates #10655 Signed-off-by: David Bond <davidsbond93@gmail.com>
Previously a TKA compaction would only run when a node starts, which means a long-running node could use unbounded storage as it accumulates ever-increasing amounts of TKA state. This patch changes TKA so it runs a compaction after every sync. Updates tailscale/corp#33537 Change-Id: I91df887ea0c5a5b00cb6caced85aeffa2a4b24ee Signed-off-by: Alex Chan <alexc@tailscale.com>
Updates tailscale/corp#33537 Signed-off-by: Alex Chan <alexc@tailscale.com>
Our style guide recommends avoiding Latin abbreviations in technical documentation, which includes the CLI help text. This is causing linter issues for the docs site, because this help text is copied into the docs. See http://go/style-guide/kb/language-and-grammar/abbreviations#latin-abbreviations Updates #cleanup Change-Id: I980c28d996466f0503aaaa65127685f4af608039 Signed-off-by: Alex Chan <alexc@tailscale.com>
Updates #cleanup Change-Id: Ib7b497e22c6cdd80578c69cf728d45754e6f909e Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now that we support using an in-memory backend for TKA state (#17946), this function always returns `nil` – we can always support Network Lock. We don't need it any more. Plus, clean up a couple of errant TODOs from that PR. Updates tailscale/corp#33599 Change-Id: Ief93bb9adebb82b9ad1b3e406d1ae9d2fa234877 Signed-off-by: Alex Chan <alexc@tailscale.com>
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
This commit enables user to set service backend to remote destinations, that can be a partial URL or a full URL. The commit also prevents user to set remote destinations on linux system when socket mark is not working. For user on any version of mac extension they can't serve a service either. The socket mark usability is determined by a new local api. Fixes tailscale/corp#24783 Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
This commit modifies the kubernetes operator to use the "stable" version of `k8s-nameserver` by default. Updates: tailscale/corp#19028 Signed-off-by: David Bond <davidsbond93@gmail.com>
From #17842 Updates #cleanup Change-Id: Ie041b50659361b50558d5ec1f557688d09935f7c Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
fixes #17990 The logging for the netns caps is spammy. Log only on changes to the values and don't log Darwin specific stuff on non Darwin clients. Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This commit adds the `spec.replicas` field to the `Recorder` custom resource that allows for a highly available deployment of `tsrecorder` within a kubernetes cluster. Many changes were required here as the code hard-coded the assumption of a single replica. This has required a few loops, similar to what we do for the `Connector` resource to create auth and state secrets. It was also required to add a check to remove dangling state and auth secrets should the recorder be scaled down. Updates: #17965 Signed-off-by: David Bond <davidsbond93@gmail.com>
These validations were previously performed in the CLI frontend. There are two motivations for moving these to the local backend: 1. The backend controls synchronization around the relevant state, so only the backend can guarantee many of these validations. 2. Doing these validations in the back-end avoids the need to repeat them across every frontend (e.g. the CLI and tsnet). Updates tailscale/corp#27200 Signed-off-by: Harry Harpham <harry@tailscale.com>
Pick up fixes for https://pkg.go.dev/vuln/GO-2025-4134 Updates #cleanup Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
…pen (#17883) With the introduction of node sealing, store.New fails in some cases due to the TPM device being reset or unavailable. Currently it results in tailscaled crashing at startup, which is not obvious to the user until they check the logs. Instead of crashing tailscaled at startup, start with an in-memory store with a health warning about state initialization and a link to (future) docs on what to do. When this health message is set, also block any login attempts to avoid masking the problem with an ephemeral node registration. Updates #15830 Updates #17654 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This is causing confusing panics in tailscale/corp#34485. We'll keep using the tka.ChonkMem constructor as much as we can, but don't panic if you create a tka.Mem directly -- we know what the sensible thing is. Updates #cleanup Signed-off-by: Alex Chan <alexc@tailscale.com> Change-Id: I49309f5f403fc26ce4f9a6cf0edc8eddf6a6f3a4
…mutex as a publisher As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to acquire a mutex that can also be held by a publisher. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #17973 Signed-off-by: Nick Khyl <nickk@tailscale.com>
As of 2025-11-20, publishing more events than the eventbus's internal queues can hold may deadlock if a subscriber tries to publish events itself. This commit adds a test that demonstrates this deadlock, and skips it until the bug is fixed. Updates #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates #17996 Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Updates #17830 Signed-off-by: Jordan Whited <jordan@tailscale.com>
…cribers Bounded DeliveredEvent queues reduce memory usage, but they can deadlock under load. Two common scenarios trigger deadlocks when the number of events published in a short period exceeds twice the queue capacity (there's a PublishedEvent queue of the same size): - a subscriber tries to acquire the same mutex as held by a publisher, or - a subscriber for A events publishes B events Avoiding these scenarios is not practical and would limit eventbus usefulness and reduce its adoption, pushing us back to callbacks and other legacy mechanisms. These deadlocks already occurred in customer devices, dev machines, and tests. They also make it harder to identify and fix slow subscribers and similar issues we have been seeing recently. Choosing an arbitrary large fixed queue capacity would only mask the problem. A client running on a sufficiently large and complex customer environment can exceed any meaningful constant limit, since event volume depends on the number of peers and other factors. Behavior also changes based on scheduling of publishers and subscribers by the Go runtime, OS, and hardware, as the issue is essentially a race between publishers and subscribers. Additionally, on lower-end devices, an unreasonably high constant capacity is practically the same as using unbounded queues. Therefore, this PR changes the event queue implementation to be unbounded by default. The PublishedEvent queue keeps its existing capacity of 16 items, while subscribers' DeliveredEvent queues become unbounded. This change fixes known deadlocks and makes the system stable under load, at the cost of higher potential memory usage, including cases where a queue grows during an event burst and does not shrink when load decreases. Further improvements can be implemented in the future as needed. Fixes #17973 Fixes #18012 Signed-off-by: Nick Khyl <nickk@tailscale.com>
Linux kernel versions 6.6.102-104 and 6.12.42-45 have a regression in /proc/net/tcp that causes seek operations to fail with "illegal seek". This breaks portlist tests on these kernels. Add kernel version detection for Linux systems and a SkipOnKernelVersions helper to tstest. Use it to skip affected portlist tests on the broken kernel versions. Thanks to philiptaron for the list of kernels with the issue and fix. Updates #16966 Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Updates tailscale/go#149 Change-Id: If0483466eb1fc2196838c75f6d53925b1809abff Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
…8568) Not all Linux distros use systemd yet, for example GL.iNet KVM devices use busybox's init, which is similar to SysV init. This is a best-effort restart attempt after the update, it probably won't cover 100% of init.d setups out there. Fixes #18567 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
…8356) When the NodeAttrDNSSubdomainResolve capability is present, enable wildcard certificate issuance to cover all single-level subdomains of a node's CertDomain. Without the capability, only exact CertDomain matches are allowed, so node.ts.net yields a cert for node.ts.net. With the capability, we now generate wildcard certificates. Wildcard certs include both the wildcard and base domain in their SANs, and ACME authorization requests both identifiers. The cert filenames are kept still based on the base domain with the wildcard prefix stripped, so we aren't creating separate files. DNS challenges still used the base domain The checkCertDomain function is replaced by resolveCertDomain that both validates and returns the appropriate cert domain to request. Name validation is now moved earlier into GetCertPEMWithValidity() Fixes #1196 Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
This resolves a gap in test coverage, ensuring Server.ListenService functions as expected in combination with user-supplied TUN devices Fixes tailscale/corp#36603 Co-authored-by: Harry Harpham <harry@tailscale.com> Signed-off-by: Harry Harpham <harry@tailscale.com>
We already had a featuretag for clientupdate, but the CLI wasn't using it, making the "minbox" build (minimal combined tailscaled + CLI build) larger than necessary. Updates #12614 Change-Id: Idd7546c67dece7078f25b8f2ae9886f58d599002 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
…uretags Package feature/conn25 is excludeable from a build via the featuretag. Test it is excluded for minimal builds. Updates #12614 Signed-off-by: Fran Bull <fran@tailscale.com>
Updates #12614 Change-Id: I49351fe0c463af0b8d940e8088d4748906a8aec3 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Use the parsed and validated advertise tags value from prefs instead of doing a strings.Split on the raw tags value as an input to the OAuth and identity federation auth key generation methods. The previous strings.Split method would return an array with a single empty string element which would pass downstream length checks on the tags argument before eventually failing with a confusing message when hitting the API. Fixes #18617 Signed-off-by: Mario Minardi <mario@tailscale.com>
If any profiles exist and an Authkey is provided via syspolicy, the AuthKey is ignored on backend start, preventing re-auth attempts. This is useful for one-time device provisioning scenarios, skipping authKey use after initial setup when the authKey may no longer be valid. updates #18618 Signed-off-by: Will Hannah <willh@tailscale.com>
Currently the expvar exporter attempts to write expvar.String, which breaks the Prometheus metric page. Updates tailscale/corp#36552 Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Updates #cleanup Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Under extremely high load it appears we may have some retention issues as a result of queue depth build up, but there is currently no direct way to observe this. The scenario does not trigger the slow subscriber log message, and the event stream debugging endpoint produces a saturating volume of information. Updates tailscale/corp#36904 Signed-off-by: James Tucker <james@tailscale.com>
Updates #18629 Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This commit adds a bool named PeerRelay to Hostinfo, to identify the host's status of acting as a peer relay. Considering the RelayServerPort number can be 0, I just made this a bool in stead of a port number. If the port info is needed in future this would also help indicating if the port was set to 0 (meaning any port in peer relay context). Updates tailscale/corp#35862 Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
… omittable Add new "webbrowser" and "colorable" feature tags so that the github.com/toqueteos/webbrowser and mattn/go-colorable packages can be excluded from minbox builds. Updates #12614 Change-Id: Iabd38b242f5a56aa10ef2050113785283f4e1fe8 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This updates the URL shown by systemd to the new URL used by the docs after the recent migration. Fixes #18646 Signed-off-by: Tim Walters <tim@tailscale.com>
wiki.nixos.org is and has been the official wiki for quite some time now. Signed-off-by: faukah <fau@faukah.com>
bart has gained a bunch of purported performance and usability improvements since the current version we are using (0.18.0, from 1y ago) Updates tailscale/corp#36982 Signed-off-by: Amal Bansode <amal@tailscale.com>
app connector packets We introduce the Conn25PacketHooks interface to be used as a nil-able field in userspaceEngine. The engine then plumbs through the functions to the corresponding tstun.Wrapper intercepts. The new intercepts run pre-filter when egressing toward WireGuard, and post-filter when ingressing from WireGuard. This is preserve the design invariant that the filter recognizes the traffic as interesting app connector traffic. This commit does not plumb through implementation of the interface, so should be a functional no-op. Fixes tailscale/corp#35985 Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
Fixes #18118 Change-Id: I118fcc6537af9ccbdc7ce6b78134e8059b0b5ccf Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
-Wait does not just wait for the created process; it waits for the entire process tree rooted at that process! This can cause the shell to wait indefinitely if something in that tree fired up any background processes. Instead we call WaitForExit on the returned process. Updates tailscale/corp#29940 Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Fixes #18631 Signed-off-by: Becky Pauley <becky@tailscale.com>
…-dns is false (#18572) fixes #18436 Queries can still make their way to the forwarder when accept-dns is disabled. Since we have not configured the forwarder if --accept-dns is false, this errors out (correctly) but it also generates a persistent health warning. This forwards the Pref setting all the way through the stack to the forwarder so that we can be more judicious about when we decide that the forward path is unintentionally missing, vs simply not configured. Testing: tailscale set --accept-dns=false. (or from the GUI) dig @100.100.100.100 example.com tailscale status No dns related health warnings should be surfaced. Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
…e Synchronize hack Restore synchronous method calls from LocalBackend to magicsock.Conn for node views, filter, and delta mutations. The eventbus delivery introduced in 8e6f63c was invalid for these updates because subsequent operations in the same call chain depend on magicsock already having the current state. The Synchronize/settleEventBus workaround was fragile and kept requiring more workarounds and introducing new mystery bugs. Since eventbus was added, we've since learned more about when to use eventbus, and this wasn't one of the cases. We can take another swing at using eventbus for netmap changes in a future change. Fixes #16369 Updates #18575 (likely fixes) Change-Id: I79057cc9259993368bb1e350ff0e073adf6b9a8f Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates rotateLocked so that we hold the activeStderrWriteForTest write lock around the dup2Stderr call, rather than acquiring it only after dup2 was already compelete. This ensures no stderrWriteForTest calls can race with the dup2 syscall. The now unused waitIdleStderrForTest has been removed. On macOS, dup2 and write on the same file descriptor are not atomic with respect to each other, when rotateLocked called dup2Stderr to redirect the stderr fd to a new file, concurrent goroutines calling stderrWriteForTest could observe the fd in a transiently invalid state, resulting in the bad file descripter. Fixes tailscale/corp#36953 Signed-off-by: James Scott <jim@tailscale.com>
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
…18681) When traffic steering is enabled, some users are suggested an exit node that is inappropriately far from their location. This seems to happen right when the client connects to the control plane and the client eventually fixes itself. But whenever an affected client reconnects, its suggested exit node flaps, and this happens often enough to be noticeable because connections drop whenever the exit node is switched. This should not happen, since the map response that contains the list of suggested exit nodes that the client picks from, also contains the scores for those nodes. Since our current logging and diagnostic tools don’t give us enough insight into what is happening, this PR adds additional logging when: - traffic steering scores are used to suggest an exit node - an exit node is suggested, no matter how it was determined Updates: tailscale/corp#29964 Updates: tailscale/corp#36446 Signed-off-by: Simon Law <sfllaw@tailscale.com>
This updates the TS_GO_NEXT=1 (testing) toolchain to Go 1.26.0 The default one is still Go 1.25.x. Updates #18682 Change-Id: I99747798c166ce162ee9eee74baa9ff6744a62f6 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.