aboutsummaryrefslogtreecommitdiff
path: root/drivers/misc/cxl/api.c
AgeCommit message (Collapse)Author
2016-07-14cxl: Add support for interrupts on the Mellanox CX4Ian Munsie
The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where interrupts are routed from the networking hardware to the XSL using the MSIX table, and from there will be transformed back into an MSIX interrupt using the cxl style interrupts (i.e. using IVTE entries and ranges to map a PE and AFU interrupt number to an MSIX address). We want to hide the implementation details of cxl interrupts as much as possible. To this end, we use a special version of the MSI setup & teardown routines in the PHB while in cxl mode to allocate the cxl interrupts and configure the IVTE entries in the process element. This function does not configure the MSIX table - the CX4 card uses a custom format in that table and it would not be appropriate to fill that out in generic code. The rest of the functionality is similar to the "Full MSI-X mode" described in the CAIA, and this could be easily extended to support other adapters that use that mode in the future. The interrupts will be associated with the default context. If the maximum number of interrupts per context has been limited (e.g. by the mlx5 driver), it will automatically allocate additional kernel contexts to associate extra interrupts as required. These contexts will be started using the same WED that was used to start the default context. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14cxl: Add preliminary workaround for CX4 interrupt limitationIan Munsie
The Mellanox CX4 has a hardware limitation where only 4 bits of the AFU interrupt number can be passed to the XSL when sending an interrupt, limiting it to only 15 interrupts per context (AFU interrupt number 0 is invalid). In order to overcome this, we will allocate additional contexts linked to the default context as extra address space for the extra interrupts - this will be implemented in the next patch. This patch adds the preliminary support to allow this, by way of adding a linked list in the context structure that we use to keep track of the contexts dedicated to interrupts, and an API to simultaneously iterate over the related context structures, AFU interrupt numbers and hardware interrupt numbers. The point of using a single API to iterate these is to hide some of the details of the iteration from external code, and to reduce the number of APIs that need to be exported via base.c to allow built in code to call. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14cxl: Add kernel APIs to get & set the max irqs per contextIan Munsie
These APIs will be used by the Mellanox CX4 support. While they function standalone to configure existing behaviour, their primary purpose is to allow the Mellanox driver to inform the cxl driver of a hardware limitation, which will be used in a future patch. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14cxl: Add support for using the kernel API with a real PHBIan Munsie
This hooks up support for using the kernel API with a real PHB. After the AFU initialisation has completed it calls into the PHB code to pass it the AFU that will be used by other peer physical functions on the adapter. The cxl_pci_to_afu API is extended to work with peer PCI devices, retrieving the peer AFU from the PHB. This API may also now return an error if it is called on a PCI device that is not associated with either a cxl vPHB or a peer PCI device to an AFU, and this error is propagated down. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-28cxl: Add set and get private data to context structMichael Neuling
This provides AFU drivers a means to associate private data with a cxl context. This is particularly intended for make the new callbacks for driver specific events easier for AFU drivers to use, as they can easily get back to any private data structures they may use. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-28cxl: Add mechanism for delivering AFU driver specific eventsPhilippe Bergheaud
This adds an afu_driver_ops structure with fetch_event() and event_delivered() callbacks. An AFU driver such as cxlflash can fill this out and associate it with a context to enable passing custom AFU specific events to userspace. This also adds a new kernel API function cxl_context_pending_events(), that the AFU driver can use to notify the cxl driver that new specific events are ready to be delivered, and wake up anyone waiting on the context wait queue. The current count of AFU driver specific events is stored in the field afu_driver_events of the context structure. The cxl driver checks the afu_driver_events count during poll, select, read, etc. calls to check if an AFU driver specific event is pending, and calls fetch_event() to obtain and deliver that event. This way, the cxl driver takes care of all the usual locking semantics around these calls and handles all the generic cxl events, so that the AFU driver only needs to worry about it's own events. fetch_event() return a struct cxl_event_afu_driver_reserved, allocated by the AFU driver, and filled in with the specific event information and size. Total event size (header + data) should not be greater than CXL_READ_MIN_SIZE (4K). Th cxl driver prepends an appropriate cxl event header, copies the event to userspace, and finally calls event_delivered() to return the status of the operation to the AFU driver. The event is identified by the context and cxl_event_afu_driver_reserved pointers. Since AFU drivers provide their own means for userspace to obtain the AFU file descriptor (i.e. cxlflash uses an ioctl on their scsi file descriptor to obtain the AFU file descriptor) and the generic cxl driver will never use this event, the ABI of the event is up to each individual AFU driver. Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16cxl: Update process element after allocating interruptsIan Munsie
In the kernel API, it is possible to attempt to allocate AFU interrupts after already starting a context. Since the process element structure used by the hardware is only filled out at the time the context is started, it will not be updated with the interrupt numbers that have just been allocated and therefore AFU interrupts will not work unless they were allocated prior to starting the context. This can present some difficulties as each CAPI enabled PCI device in the kernel API has a default context, which may need to be started very early to enable translations, potentially before interrupts can easily be set up. This patch makes the API more flexible to allow interrupts to be allocated after a context has already been started and takes care of updating the PE structure used by the hardware and notifying it to discard any cached copy it may have. The update is currently performed via a terminate/remove/add sequence. This is necessary on some hardware such as the XSL that does not properly support the update LLCMD. Note that this is only supported on powernv at present - attempting to perform this ordering on PowerVM will raise a warning. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11cxl: Add kernel API to allow a context to operate with relocate disabledIan Munsie
cxl devices typically access memory using an MMU in much the same way as the CPU, and each context includes a state register much like the MSR in the CPU. Like the CPU, the state register includes a bit to enable relocation, which we currently always enable. In some cases, it may be desirable to allow a device to access memory using real addresses instead of effective addresses, so this adds a new API, cxl_set_translation_mode, that can be used to disable relocation on a given kernel context. This can allow for the creation of a special privileged context that the device can use if it needs relocation disabled, and can use regular contexts at times when it needs relocation enabled. This interface is only available to users of the kernel API for obvious reasons, and will never be supported in a virtualised environment. This will be used by the upcoming cxl support in the mlx5 driver. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11cxl: Remove dead codeFrederic Barrat
Function cxl_get_phys_dev() was removed from the kernel API by a previous patch, but it's actually dead code. Remove it. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Remove cxl_get_phys_dev() kernel APIFrederic Barrat
The cxl_get_phys_dev() API returns a struct device pointer which could belong to either a struct pci_dev (bare-metal) or platform_device (powerVM). To avoid potential problems in drivers, remove that API. It was introduced to allow drivers to read the VPD of the adapter, but the cxl driver now provides the cxl_pci_read_adapter_vpd() API for that purpose. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Support the cxl kernel API from a guestFrederic Barrat
Like on bare-metal, the cxl driver creates a virtual PHB and a pci device for the AFU. The configuration space of the device is mapped to the configuration record of the AFU. Reuse the code defined in afu_cr_read8|16|32() when reading the configuration space of the AFU device. Even though the (virtual) AFU device is a pci device, the adapter is not. So a driver using the cxl kernel API cannot read the VPD of the adapter through the usual PCI interface. Therefore, we add a call to the cxl kernel API: ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count); Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Add guest-specific codeChristophe Lombard
The new of.c file contains code to parse the device tree to find out about cxl adapters and AFUs. guest.c implements the guest-specific callbacks for the backend API. The process element ID is not known until the context is attached, so we have to separate the context ID assigned by the cxl driver from the process element ID visible to the user applications. In bare-metal, the 2 IDs match. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> [mpe: Fix SMP=n build, fix PSERIES=n build, minor whitespace fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Introduce implementation-specific APIFrederic Barrat
The backend API (in cxl.h) lists some low-level functions whose implementation is different on bare-metal and in a guest. Each environment implements its own functions, and the common code uses them through function pointers, defined in cxl_backend_ops Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09cxl: Define process problem state area at attach time onlyFrederic Barrat
CXL kernel API was defining the process problem state area during context initialization, making it possible to map the problem state area before attaching the context. This won't work on a powerVM guest. So force the logical behavior, like in userspace: attach first, then map the problem state area. Remove calls to cxl_assign_psn_space during init. The function is already called on the attach paths. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-05cxl: Fix DSI misses when the context owning task exitsVaibhav Jain
Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we store the pid of the current task_struct and use it to get pointer to the mm_struct of the process, while processing page or segment faults from the capi card. However this causes issues when the thread that had originally issued the start-work ioctl exits in which case the stored pid is no more valid and the cxl driver is unable to handle faults as the mm_struct corresponding to process is no more accessible. This patch fixes this issue by using the mm_struct of the next alive task in the thread group. This is done by iterating over all the tasks in the thread group starting from thread group leader and calling get_task_mm on each one of them. When a valid mm_struct is obtained the pid of the associated task is stored in the context replacing the exiting one for handling future faults. The patch introduces a new function named get_mem_context that checks if the current task pointed to by ctx->pid is dead? If yes it performs the steps described above. Also a new variable cxl_context.glpid is introduced which stores the pid of the thread group leader associated with the context owning task. Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reported-by: Frank Haverkamp <HAVERKAM@de.ibm.com> Suggested-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-24cxl: Fix possible idr warning when contexts are releasedVaibhav Jain
An idr warning is reported when a context is release after the capi card is unbound from the cxl driver via sysfs. Below are the steps to reproduce: 1. Create multiple afu contexts in an user-space application using libcxl. 2. Unbind capi card from cxl using command of form echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind 3. Exit/kill the application owning afu contexts. After above steps a warning message is usually seen in the kernel logs of the form "idr_remove called for id=<context-id> which is not allocated." This is caused by the function cxl_release_afu which destroys the contexts_idr table. So when a context is release no entry for context pe is found in the contexts_idr table and idr code prints this warning. This patch fixes this issue by increasing & decreasing the ref-count on the afu device when a context is initialized or when its freed respectively. This prevents the afu from being released until all the afu contexts have been released. The patch introduces two new functions namely cxl_afu_get/put that manage the ref-count on the afu device. Also the patch removes code inside cxl_dev_context_init that increases ref on the afu device as its guaranteed to be alive during this function. Reported-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-01cxl: fix leak of IRQ names in cxl_free_afu_irqs()Andrew Donnellan
cxl_free_afu_irqs() doesn't free IRQ names when it releases an AFU's IRQ ranges. The userspace API equivalent in afu_release_irqs() calls afu_irq_name_free() to release the IRQ names. Call afu_irq_name_free() in cxl_free_afu_irqs() to release the IRQ names. Make afu_irq_name_free() non-static to allow this. Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API") Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-30cxl: Fix force unmapping mmaps of contexts allocated through the kernel apiIan Munsie
The cxl user api uses the address_space associated with the file when we need to force unmap all cxl mmap regions (e.g. on eeh, driver detach, etc). Currently, contexts allocated through the kernel api do not do this and instead skip the mmap invalidation, potentially allowing them to poke at the hardware after such an event, which may cause all sorts of trouble. This patch allocates an address_space for cxl contexts allocated through the kernel api so that the same invalidate path will for these contexts as well. We don't use the anonymous inode's address_space, as doing so could invalidate any mmaps of completely unrelated drivers using anonymous file descriptors. This patch also introduces a kernelapi flag, so we know when freeing the context if the address_space was allocated by us and needs to be freed. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-30cxl: Fix + cleanup error paths in cxl_dev_context_initIan Munsie
If the cxl_context_alloc() call fails, we return immediately without releasing the reference on the AFU device, allowing it to leak. This patch switches to using goto style error handling so that the device is released in common code for both error paths, and will also simplify things if we add additional initialisation in this function in the future. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-20cxl: Allow release of contexts which have been OPENED but not STARTEDAndrew Donnellan
If we open a context but do not start it (either because we do not attempt to start it, or because it fails to start for some reason), we are left with a context in state OPENED. Previously, cxl_release_context() only allowed releasing contexts in state CLOSED, so attempting to release an OPENED context would fail. In particular, this bug causes available contexts to run out after some EEH failures, where drivers attempt to release contexts that have failed to start. Allow releasing contexts in any state with a value lower than STARTED, i.e. OPENED or CLOSED (we can't release a STARTED context as it's currently using the hardware, and we assume that contexts in any new states which may be added in future with a value higher than STARTED are also unsafe to release). Cc: stable@vger.kernel.org Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API") Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14cxl: Allow the kernel to trust that an image won't change on PERST.Daniel Axtens
Provide a kernel API and a sysfs entry which allow a user to specify that when a card is PERSTed, it's image will stay the same, allowing it to participate in EEH. cxl_reset is used to reflash the card. In that case, we cannot safely assert that the image will not change. Therefore, disallow cxl_reset if the flag is set. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07cxl: Fix refcounting in kernel APIMichael Neuling
Currently the kernel API AFU dev refcounting is done on context start and stop. This patch moves this refcounting to context init and release, bringing it inline with how the userspace API does it. Without this we've seen the refcounting on the AFU get out of whack between the user and kernel API usage. This causes the AFU structures to be freed when they are actually still in use. This fixes some kref warnings we've been seeing and spurious ErrIVTE IRQs. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03cxl: Add AFU virtual PHB and kernel APIMichael Neuling
This patch does two things. Firstly it presents the Accelerator Function Unit (AFUs) behind the POWER Service Layer (PSL) as PCI devices on a virtual PCI Host Bridge (vPHB). This in in addition to the PSL being a PCI device itself. As part of the Coherent Accelerator Interface Architecture (CAIA) AFUs can provide an AFU configuration. This AFU configuration recored is architected to be the same as a PCI config space. This patch sets discovers the AFU configuration records, provides AFU config space read/write functions to these configuration records. It then enumerates the PCI bus. It also hooks in PCI ops where appropriate. It also destroys the vPHB when the physical card is removed. Secondly, it add an in kernel API for AFU to use CXL. AFUs must present a driver that firstly binds as a PCI device. This PCI device can then be using to do CXL specific operations (that can't sit in the PCI ops) using this API. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>