Commit Graph

153 Commits

Author SHA1 Message Date
Howard Zhang 253f180c69 Fix comments error
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2021-03-25 17:39:45 +08:00
gVisor bot 8ee4a3f6d0 Merge pull request #5677 from avagin:kvm-mmio
PiperOrigin-RevId: 364728696
2021-03-23 22:50:14 -07:00
Andrei Vagin 56a9a13976 Move the code that manages floating-point state to a separate package
This change is inspired by Adin's cl/355256448.

PiperOrigin-RevId: 364695931
2021-03-23 18:46:37 -07:00
Andrei Vagin 2f3dac78ca kvm: prefault a floating point state before restoring it
If physical pages of a memory region are not mapped yet, the kernel will
trigger KVM_EXIT_MMIO and we will map physical pages in bluepillHandler().

An instruction that triggered a fault will not be re-executed; it
will be emulated in the kernel, but the kernel can't emulate complex
instructions like xsave and xrstor. We can touch the memory with
simple instructions to work around this problem.
2021-03-16 21:55:20 -07:00
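The "touch the memory with simple instructions" idea above can be sketched in Go roughly as follows. This is a minimal illustration, not the code from the commit: the name touchPages and its signature are hypothetical, and it assumes that a plain one-byte load per page is enough to get the page mapped before a complex instruction uses it.

```go
package main

import (
	"fmt"
	"os"
)

// touchPages is a hypothetical helper: it performs one plain byte load in
// every page that buf may span, so each page is faulted in by a simple
// instruction. It returns the XOR of the loaded bytes so the loads cannot
// be optimized away.
func touchPages(buf []byte, pageSize int) byte {
	if len(buf) == 0 || pageSize <= 0 {
		return 0
	}
	var sink byte
	for off := 0; off < len(buf); off += pageSize {
		sink ^= buf[off]
	}
	// buf may not be page aligned, so its last byte can sit in one more
	// page than the loop above visited.
	sink ^= buf[len(buf)-1]
	return sink
}

func main() {
	pageSize := os.Getpagesize()
	state := make([]byte, 2*pageSize) // stand-in for a floating point state area
	fmt.Println(touchPages(state, pageSize))
}
```

Only after such a pass would the state area be handed to the restore path in this sketch.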
Ayush Ranjan a9441aea27 [op] Replace syscall package usage with golang.org/x/sys/unix in pkg/.
The syscall package has been deprecated in favor of golang.org/x/sys.

Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink related functionalities
  are not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
  returns it and not unix.Stat_t.

Updates #214

PiperOrigin-RevId: 360701387
2021-03-03 10:25:58 -08:00
Michael Pratt f80a857a4f Bump build constraints to Go 1.18
These are bumped to allow early testing of Go 1.17. Use will be audited closer
to the 1.17 release.

PiperOrigin-RevId: 358278615
2021-02-18 15:30:58 -08:00
Robin Luk 6eb80b2e2d arm64 kvm: implement basic lazy save and restore for FPSIMD registers
Implement basic lazy save and restore for FPSIMD registers, which only
restores FPSIMD state on el0_fpsimd_acc and saves FPSIMD state in switch().

Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2021-02-03 11:50:36 +00:00
Adin Scannell f884ea13b7 Move ring0 package.
This allows the package to serve as a general purpose ring0 support package, as
opposed to being bound to specific sentry platforms.

Updates #5039

PiperOrigin-RevId: 355220044
2021-02-02 12:03:26 -08:00
gVisor bot 64bff178b8 Merge pull request #4792 from lubinszARM:pr_kvm_test
PiperOrigin-RevId: 351638451
2021-01-13 12:12:26 -08:00
gVisor bot 70de1db82e Merge pull request #4933 from lubinszARM:pr_kvm_el0_exceptions
PiperOrigin-RevId: 350862699
2021-01-08 17:08:36 -08:00
Robin Luk 7e91b3cdec arm64 kvm: revert some kpti related code, and configure upper pagetable as global
To improve performance, some kpti related code (TCR.A1) has
been reverted, and the kernel pagetable is set as global.

Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2020-12-29 19:35:17 +08:00
Robin Luk 3868c7dd40 arm64 kvm: add more handling of el0_exceptions
Add more comments and more handling for exceptions.

Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2020-11-25 14:36:41 +08:00
Robin Luk 6a85d13ccf arm64 kvm: add ext_dabt injection support
If kvm reports no valid syndrome (data abort outside memslots), let userspace do the
ext_dabt injection to bail out of this issue.

Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2020-11-23 16:47:19 +08:00
Robin Luk 170b584222 arm64 kvm: add the processing functions for all el0/el1 exceptions
I added 2 unified processing functions for all exceptions of el1/el0

Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2020-11-17 14:54:33 +08:00
Robin Luk b7de12fc03 kvm-test: adjust the check logic in TestWrongVCPU case
Signed-off-by: Robin Luk <lubin.lu@alibaba-inc.com>
2020-11-12 03:52:37 +00:00
gVisor bot 861c11bfa7 Merge pull request #3617 from laijs:upperhalf
PiperOrigin-RevId: 340484823
2020-11-03 11:19:04 -08:00
lubinszARM 0e96f8065e arm64 kvm: inject sError to trigger sigbus
Use an sError injection to trigger sigbus when we receive EFAULT from the
run ioctl.

After applying this patch, mmap_test_runsc_kvm passes on
Arm64.

Signed-off-by: Bin Lu <bin.lu@arm.com>
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/4542 from lubinszARM:pr_kvm_mmap_1 f81bd42466d1d60a581e5fb34de18b78878c68c1
PiperOrigin-RevId: 340461239
2020-11-03 09:34:39 -08:00
Lai Jiangshan 3425485b7c kvm: share upper halves among all pagetables
Fixes: #509

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-11-03 00:10:32 +08:00
gVisor bot c94bf137da Merge pull request #4564 from zhlhahaha:1981
PiperOrigin-RevId: 339921446
2020-10-30 12:45:24 -07:00
Bin Lu 56b5c71bac arm64 kvm: added the implementation of setSystemTimeLegacy()
I have added support for setSystemTimeLegacy() by setting cntvoff.

With this pr, TestRdtsc and other kvm syscall test cases (nanosleep,
wait...) pass on Arm64.

TO-DO: Add precise synchronization to KVM for Arm64.
Reference PR: https://github.com/google/gvisor/pull/4397

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-10-22 01:46:09 -04:00
gVisor bot 1b2097f84e Merge pull request #4535 from lubinszARM:pr_kvm_exec_binary_1
PiperOrigin-RevId: 338321125
2020-10-21 12:53:11 -07:00
Howard Zhang d7ea53769f ARM64 KVM: bad regs.Sp return SIGSEGV
Consistent with the linux kernel, a bad regs.Sp
returns SIGSEGV

Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2020-10-20 15:50:09 +08:00
Bin Lu 3b735c8fec arm64 kvm: handle exception from accessing undefined instruction
Consistent with the linux approach, we produce a sigill to handle
el0_undef.

After applying this patch, exec_binary_test_runsc_kvm passes on
Arm64.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-10-18 21:47:12 -04:00
gVisor bot b491712e11 Merge pull request #4387 from lubinszARM:pr_tls_host_sentry_1
PiperOrigin-RevId: 337544656
2020-10-16 11:32:38 -07:00
gVisor bot dbe122c92f Merge pull request #4386 from lubinszARM:pr_testutil_tls_usr
PiperOrigin-RevId: 336970511
2020-10-13 15:42:24 -07:00
Adin Scannell d9b32efb30 Avoid excessive Tgkill and wait operations.
The required states may simply not be observed by the thread running bounce, so
track guest and user generations to ensure that at least one of the desired
state transitions happens.

Fixes #3532

PiperOrigin-RevId: 336908216
2020-10-13 10:43:45 -07:00
gVisor bot 93bc0777be Merge pull request #4072 from adamliyi:droppt_fix
PiperOrigin-RevId: 336719900
2020-10-12 12:34:43 -07:00
Bin Lu 1557153cad arm64 kvm: add tls-usr support
The tls of guest-el1-sentry and host-el0-sentry may be different on Arm64.
I added a solution for it.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-10-11 23:32:54 -04:00
gVisor bot 6df400dfb6 Merge pull request #4040 from lemin9538:lemin_arm64
PiperOrigin-RevId: 336362818
2020-10-09 14:14:03 -07:00
Min Le 190cf30e41 arm64: the mair_el1 value is wrong
The correct value is 0xbbff440c0400, but the const
defined is 0x000000000000ffc0 due to an operator error
in _MT_EL1_INIT. Both kernel and user space memory
attributes should be Normal memory, not DEVICE_nGnRE.

Signed-off-by: Min Le <lemin.lm@antgroup.com>
2020-10-08 20:33:09 +08:00
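One plausible reading of the operator error above is a missing pair of parentheses around the shift amount. In Go, << and * share the same precedence level and associate left to right, so attr << idx * 8 is evaluated as (attr << idx) * 8. The sketch below reproduces both values quoted in the commit message. The attribute encodings and slot order are assumed to follow the Linux arm64 layout, and the function names are hypothetical; neither is taken from the commit itself.

```go
package main

import "fmt"

// Attribute bytes for MAIR_EL1 slots 0..5. These values and their order are
// an assumption (they mirror the Linux arm64 layout), not taken from the log.
var attrs = []uint64{
	0x00, // Device-nGnRnE
	0x04, // Device-nGnRE
	0x0c, // Device-GRE
	0x44, // Normal, non-cacheable
	0xff, // Normal, write-back
	0xbb, // Normal, write-through
}

// mairAttr places an 8-bit attribute into slot idx of the register.
func mairAttr(attr, idx uint64) uint64 { return attr << (idx * 8) }

// buggyAttr omits the parentheses, so Go computes (attr << idx) * 8.
func buggyAttr(attr, idx uint64) uint64 { return attr << idx * 8 }

// build ORs together all slots using the given placement function.
func build(place func(attr, idx uint64) uint64) uint64 {
	var v uint64
	for i, a := range attrs {
		v |= place(a, uint64(i))
	}
	return v
}

func main() {
	fmt.Printf("correct: %#x\n", build(mairAttr))  // correct: 0xbbff440c0400
	fmt.Printf("buggy:   %#x\n", build(buggyAttr)) // buggy:   0xffc0
}
```

Parenthesizing the shift amount explicitly avoids relying on the reader remembering that Go differs from C here, where multiplication binds tighter than shifts.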
Adin Scannell ecf9a7ef09 Add precise synchronization to KVM.
By using TSC scaling as a hack, we can trick the kernel into setting an offset
of exactly zero. Huzzah!

PiperOrigin-RevId: 335922019
2020-10-07 12:08:09 -07:00
Jamie Liu 1336af78d5 Implement membarrier(2) commands other than *_SYNC_CORE.
Updates #267

PiperOrigin-RevId: 335713923
2020-10-06 13:55:16 -07:00
Andrei Vagin de85b045d4 kvm/x86: handle a case when interrupts are enabled in the kernel space
Previously we assumed that interrupts are always disabled in kernel
space, but there is a case where the Go runtime switches to a goroutine
that was saved in host mode. On restore, the popf instruction is
used to restore flags, which means that all flags the goroutine
had in host mode are restored in kernel mode. And in
host mode, interrupts are always enabled.

Long story short, we can't use the IF flag to determine whether a
task is running in user or kernel mode.

This patch reworks the code so that in userspace, the first bit of the
IOPL flag is always set. This doesn't give any new privileges to
a task because CPL in userspace is always 3. But then we can use this
flag to distinguish user and kernel modes. The IOPL flag is never set in
the kernel and host modes.

Reported-by: syzbot+5036b325a8eb15c030cf@syzkaller.appspotmail.com
Reported-by: syzbot+034d580e89ad67b8dc75@syzkaller.appspotmail.com
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-10-02 13:16:58 -07:00
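The flag check described in the commit above can be sketched as follows. The bit positions are the architectural x86 EFLAGS ones (IF is bit 9, IOPL occupies bits 12 and 13); the constant and function names are hypothetical and are not claimed to match the identifiers used in the patch.

```go
package main

import "fmt"

const (
	flagIF    uint64 = 1 << 9  // interrupt enable flag
	flagIOPL0 uint64 = 1 << 12 // low bit of the two-bit IOPL field
)

// userFlags returns the flags value to load when entering userspace, with
// the marker bit forced on. With CPL 3 this grants no extra privileges.
func userFlags(flags uint64) uint64 { return flags | flagIOPL0 }

// inUserMode reports whether a saved flags value was captured in userspace.
// IF is deliberately ignored, since it can also be set in kernel mode.
func inUserMode(flags uint64) bool { return flags&flagIOPL0 != 0 }

func main() {
	kernelFlags := flagIF // goroutine restored via popf with interrupts enabled
	fmt.Println(inUserMode(kernelFlags))       // false
	fmt.Println(inUserMode(userFlags(flagIF))) // true
}
```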
Yi Li bc14050ebf arm64 kvm: fix panic in kvm.dropPageTables
Related to issues #3019 and #4056.
When running hello-world with gvisor-kvm, there is a panic on exit:

"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x3c0 pc=0x7c3f18]

goroutine 284 [running]:
... ...
gvisor.dev/gvisor/pkg/sentry/platform/kvm.(*machine).dropPageTables(0x4000166840, 0x400032a040)
    pkg/sentry/platform/kvm/machine_arm64.go:111 +0x88 fp=0x4000479e00 sp=0x4000479da0 pc=0x7c3f18
"

Also make dropPageTables() arch independent.
2020-09-30 21:49:42 +00:00
gVisor bot 9751044a96 Merge pull request #2256 from laijs:kpti
PiperOrigin-RevId: 334674481
2020-09-30 14:07:43 -07:00
Bin Lu 4d421f58fe arm64 kvm: add a test case for kernel-tls checking
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-30 11:16:59 +08:00
Bin Lu 8a5af9a08d arm64 mm: asid and tlb support
Some optimizations in this pr:
  1. Move ASID from TTBR0 to TTBR1
  2. tlb_flush_all

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-11 04:44:26 -04:00
gVisor bot a4b1c6f5a4 Merge pull request #3742 from lubinszARM:pr_n1_1
PiperOrigin-RevId: 328639254
2020-08-26 17:10:16 -07:00
Adin Scannell 983a55aa06 Support stdlib analyzers with nogo.
This immediately revealed an escape analysis violation (!), where
the sync.Map was being used in a context where escapes were not
allowed. This is a relatively minor fix and is included.

PiperOrigin-RevId: 328611237
2020-08-26 14:42:35 -07:00
Bin Lu 57bfbed1d6 Device major number greater than 2 digits in /proc/self/maps on arm64 N1 machine
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-08-24 22:41:01 -04:00
Michael Pratt ab6c474210 Bump build constraints to 1.17
This enables pre-release testing with 1.16. The intention is to replace these
with a nogo check before the next release.

PiperOrigin-RevId: 328193911
2020-08-24 12:58:39 -07:00
Bin Lu 05d742ede4 Running hello-world on Thunderx2 with kvm
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-08-12 05:37:27 -04:00
Andrei Vagin 13a8ae81b2 Add context.FullStateChanged()
It indicates that the Sentry has changed the state of the thread and
subsequent calls to PullFullState() must do nothing.

PiperOrigin-RevId: 325567415
2020-08-07 22:49:55 -07:00
gVisor bot 8f6d576afe Merge pull request #3069 from lubinszARM:pr_serr_injection2
PiperOrigin-RevId: 325546308
2020-08-07 18:32:25 -07:00
Lai Jiangshan 9cae407b27 amd64: implement KPTI for gvisor
Actually, gvisor already has KPTI (Kernel PageTable Isolation) between
gr0 and gr3. But the upper half of the userCR3 contains the
whole sentry kernel, which makes the kernel vulnerable to
a gr3 app through CPU bugs.

This patch implements full KPTI functionality for gvisor. It doesn't
map the whole kernel in the upper half. It maps only the text section
of the binary and the entry area required by the ISA. The entry area
contains the global idt, the percpu gdt/tss etc. The entry area
packs all these together, which takes less than 350k for 512 vCPUs.

The text section is normally nonsensitive. It is possible to
map only the entry functions (interrupt handlers etc.),
but that requires some hacks.

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-06 21:31:51 +08:00
Lai Jiangshan 6ce10c3c2f amd64: introduce kernelEntry
kernelEntry is split out from CPU and contains the minimal CPU-specific
arch state that can be mapped in the upper half of the address space.

It prepares for KPTI for gvisor.

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-05 17:22:15 +08:00
Lai Jiangshan d17425082d amd64: don't check vcpu in bluepill()
m.Get() has guaranteed that if any OS thread TID is in guest,
m.vCPUs[TID] points to the vCPU in which the OS thread TID is running.

So if m.Get() returns with the current context in guest,
its vCPU must be the same as what Get() returns.

So bluepill() doesn't need to check whether the vCPU matches.
The check needs access to the %gs register, which will no longer point
to the vCPU once KPTI for gvisor is enabled. We can still
fetch the vCPU pointer from %gs later (when %gs points to kernelEntry),
but that needs ENTRY_CPU_SELF, which is generated by
ring0/offset_amd64.go. So we simply remove the check.

Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-05 17:21:17 +08:00
Andrei Vagin 25798f214c Add callbacks to support lazy loading/restoring thread states
PiperOrigin-RevId: 324748508
2020-08-03 22:08:25 -07:00
gVisor bot 6a4bcbdb28 Merge pull request #3448 from lubinszARM:pr_tls_tests
PiperOrigin-RevId: 324127810
2020-07-30 18:44:17 -07:00
gVisor bot c9515dcca3 Merge pull request #3028 from lubinszARM:pr_kvm_hello1
PiperOrigin-RevId: 324125938
2020-07-30 18:29:32 -07:00