You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

276 lines
8.1 KiB

/*
* Copyright (c) 2013-2018, ARM Limited and Contributors. All rights reserved.
*
* SPDX-License-Identifier: BSD-3-Clause
*/
#include <platform_def.h>
#include <xlat_tables_defs.h>
OUTPUT_FORMAT(PLATFORM_LINKER_FORMAT)
OUTPUT_ARCH(PLATFORM_LINKER_ARCH)
ENTRY(bl31_entrypoint)
MEMORY {
RAM (rwx): ORIGIN = BL31_BASE, LENGTH = BL31_LIMIT - BL31_BASE
}
#ifdef PLAT_EXTRA_LD_SCRIPT
#include <plat.ld.S>
#endif
SECTIONS
{
. = BL31_BASE;
ASSERT(. == ALIGN(PAGE_SIZE),
"BL31_BASE address is not aligned on a page boundary.")
Introduce SEPARATE_CODE_AND_RODATA build flag At the moment, all BL images share a similar memory layout: they start with their code section, followed by their read-only data section. The two sections are contiguous in memory. Therefore, the end of the code section and the beginning of the read-only data one might share a memory page. This forces both to be mapped with the same memory attributes. As the code needs to be executable, this means that the read-only data stored on the same memory page as the code are executable as well. This could potentially be exploited as part of a security attack. This patch introduces a new build flag called SEPARATE_CODE_AND_RODATA, which isolates the code and read-only data on separate memory pages. This in turn allows independent control of the access permissions for the code and read-only data. This has an impact on memory footprint, as padding bytes need to be introduced between the code and read-only data to ensure the segragation of the two. To limit the memory cost, the memory layout of the read-only section has been changed in this case. - When SEPARATE_CODE_AND_RODATA=0, the layout is unchanged, i.e. the read-only section still looks like this (padding omitted): | ... | +-------------------+ | Exception vectors | +-------------------+ | Read-only data | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script provides the limits of the whole read-only section. - When SEPARATE_CODE_AND_RODATA=1, the exception vectors and read-only data are swapped, such that the code and exception vectors are contiguous, followed by the read-only data. This gives the following new layout (padding omitted): | ... | +-------------------+ | Read-only data | +-------------------+ | Exception vectors | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script now exports 2 sets of addresses instead: the limits of the code and the limits of the read-only data. Refer to the Firmware Design guide for more details. This provides platform code with a finer-grained view of the image layout and allows it to map these 2 regions with the appropriate access permissions. Note that SEPARATE_CODE_AND_RODATA applies to all BL images. Change-Id: I936cf80164f6b66b6ad52b8edacadc532c935a49
8 years ago
#if SEPARATE_CODE_AND_RODATA
.text . : {
__TEXT_START__ = .;
*bl31_entrypoint.o(.text*)
*(.text*)
*(.vectors)
. = ALIGN(PAGE_SIZE);
Introduce SEPARATE_CODE_AND_RODATA build flag At the moment, all BL images share a similar memory layout: they start with their code section, followed by their read-only data section. The two sections are contiguous in memory. Therefore, the end of the code section and the beginning of the read-only data one might share a memory page. This forces both to be mapped with the same memory attributes. As the code needs to be executable, this means that the read-only data stored on the same memory page as the code are executable as well. This could potentially be exploited as part of a security attack. This patch introduces a new build flag called SEPARATE_CODE_AND_RODATA, which isolates the code and read-only data on separate memory pages. This in turn allows independent control of the access permissions for the code and read-only data. This has an impact on memory footprint, as padding bytes need to be introduced between the code and read-only data to ensure the segragation of the two. To limit the memory cost, the memory layout of the read-only section has been changed in this case. - When SEPARATE_CODE_AND_RODATA=0, the layout is unchanged, i.e. the read-only section still looks like this (padding omitted): | ... | +-------------------+ | Exception vectors | +-------------------+ | Read-only data | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script provides the limits of the whole read-only section. - When SEPARATE_CODE_AND_RODATA=1, the exception vectors and read-only data are swapped, such that the code and exception vectors are contiguous, followed by the read-only data. This gives the following new layout (padding omitted): | ... | +-------------------+ | Read-only data | +-------------------+ | Exception vectors | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script now exports 2 sets of addresses instead: the limits of the code and the limits of the read-only data. Refer to the Firmware Design guide for more details. This provides platform code with a finer-grained view of the image layout and allows it to map these 2 regions with the appropriate access permissions. Note that SEPARATE_CODE_AND_RODATA applies to all BL images. Change-Id: I936cf80164f6b66b6ad52b8edacadc532c935a49
8 years ago
__TEXT_END__ = .;
} >RAM
.rodata . : {
__RODATA_START__ = .;
*(.rodata*)
/* Ensure 8-byte alignment for descriptors and ensure inclusion */
. = ALIGN(8);
__RT_SVC_DESCS_START__ = .;
KEEP(*(rt_svc_descs))
__RT_SVC_DESCS_END__ = .;
#if ENABLE_PMF
/* Ensure 8-byte alignment for descriptors and ensure inclusion */
. = ALIGN(8);
__PMF_SVC_DESCS_START__ = .;
KEEP(*(pmf_svc_descs))
__PMF_SVC_DESCS_END__ = .;
#endif /* ENABLE_PMF */
/*
* Ensure 8-byte alignment for cpu_ops so that its fields are also
* aligned. Also ensure cpu_ops inclusion.
*/
. = ALIGN(8);
__CPU_OPS_START__ = .;
KEEP(*(cpu_ops))
__CPU_OPS_END__ = .;
/* Place pubsub sections for events */
. = ALIGN(8);
#include <pubsub_events.h>
. = ALIGN(PAGE_SIZE);
Introduce SEPARATE_CODE_AND_RODATA build flag At the moment, all BL images share a similar memory layout: they start with their code section, followed by their read-only data section. The two sections are contiguous in memory. Therefore, the end of the code section and the beginning of the read-only data one might share a memory page. This forces both to be mapped with the same memory attributes. As the code needs to be executable, this means that the read-only data stored on the same memory page as the code are executable as well. This could potentially be exploited as part of a security attack. This patch introduces a new build flag called SEPARATE_CODE_AND_RODATA, which isolates the code and read-only data on separate memory pages. This in turn allows independent control of the access permissions for the code and read-only data. This has an impact on memory footprint, as padding bytes need to be introduced between the code and read-only data to ensure the segragation of the two. To limit the memory cost, the memory layout of the read-only section has been changed in this case. - When SEPARATE_CODE_AND_RODATA=0, the layout is unchanged, i.e. the read-only section still looks like this (padding omitted): | ... | +-------------------+ | Exception vectors | +-------------------+ | Read-only data | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script provides the limits of the whole read-only section. - When SEPARATE_CODE_AND_RODATA=1, the exception vectors and read-only data are swapped, such that the code and exception vectors are contiguous, followed by the read-only data. This gives the following new layout (padding omitted): | ... | +-------------------+ | Read-only data | +-------------------+ | Exception vectors | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script now exports 2 sets of addresses instead: the limits of the code and the limits of the read-only data. Refer to the Firmware Design guide for more details. This provides platform code with a finer-grained view of the image layout and allows it to map these 2 regions with the appropriate access permissions. Note that SEPARATE_CODE_AND_RODATA applies to all BL images. Change-Id: I936cf80164f6b66b6ad52b8edacadc532c935a49
8 years ago
__RODATA_END__ = .;
} >RAM
#else
ro . : {
__RO_START__ = .;
*bl31_entrypoint.o(.text*)
*(.text*)
*(.rodata*)
/* Ensure 8-byte alignment for descriptors and ensure inclusion */
. = ALIGN(8);
__RT_SVC_DESCS_START__ = .;
KEEP(*(rt_svc_descs))
__RT_SVC_DESCS_END__ = .;
#if ENABLE_PMF
/* Ensure 8-byte alignment for descriptors and ensure inclusion */
. = ALIGN(8);
__PMF_SVC_DESCS_START__ = .;
KEEP(*(pmf_svc_descs))
__PMF_SVC_DESCS_END__ = .;
#endif /* ENABLE_PMF */
/*
* Ensure 8-byte alignment for cpu_ops so that its fields are also
* aligned. Also ensure cpu_ops inclusion.
*/
. = ALIGN(8);
__CPU_OPS_START__ = .;
KEEP(*(cpu_ops))
__CPU_OPS_END__ = .;
/* Place pubsub sections for events */
. = ALIGN(8);
#include <pubsub_events.h>
*(.vectors)
__RO_END_UNALIGNED__ = .;
/*
* Memory page(s) mapped to this section will be marked as read-only,
* executable. No RW data from the next section must creep in.
* Ensure the rest of the current memory page is unused.
*/
. = ALIGN(PAGE_SIZE);
__RO_END__ = .;
} >RAM
Introduce SEPARATE_CODE_AND_RODATA build flag At the moment, all BL images share a similar memory layout: they start with their code section, followed by their read-only data section. The two sections are contiguous in memory. Therefore, the end of the code section and the beginning of the read-only data one might share a memory page. This forces both to be mapped with the same memory attributes. As the code needs to be executable, this means that the read-only data stored on the same memory page as the code are executable as well. This could potentially be exploited as part of a security attack. This patch introduces a new build flag called SEPARATE_CODE_AND_RODATA, which isolates the code and read-only data on separate memory pages. This in turn allows independent control of the access permissions for the code and read-only data. This has an impact on memory footprint, as padding bytes need to be introduced between the code and read-only data to ensure the segragation of the two. To limit the memory cost, the memory layout of the read-only section has been changed in this case. - When SEPARATE_CODE_AND_RODATA=0, the layout is unchanged, i.e. the read-only section still looks like this (padding omitted): | ... | +-------------------+ | Exception vectors | +-------------------+ | Read-only data | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script provides the limits of the whole read-only section. - When SEPARATE_CODE_AND_RODATA=1, the exception vectors and read-only data are swapped, such that the code and exception vectors are contiguous, followed by the read-only data. This gives the following new layout (padding omitted): | ... | +-------------------+ | Read-only data | +-------------------+ | Exception vectors | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script now exports 2 sets of addresses instead: the limits of the code and the limits of the read-only data. Refer to the Firmware Design guide for more details. This provides platform code with a finer-grained view of the image layout and allows it to map these 2 regions with the appropriate access permissions. Note that SEPARATE_CODE_AND_RODATA applies to all BL images. Change-Id: I936cf80164f6b66b6ad52b8edacadc532c935a49
8 years ago
#endif
ASSERT(__CPU_OPS_END__ > __CPU_OPS_START__,
"cpu_ops not defined for this platform.")
SPM: Introduce Secure Partition Manager A Secure Partition is a software execution environment instantiated in S-EL0 that can be used to implement simple management and security services. Since S-EL0 is an unprivileged exception level, a Secure Partition relies on privileged firmware e.g. ARM Trusted Firmware to be granted access to system and processor resources. Essentially, it is a software sandbox that runs under the control of privileged software in the Secure World and accesses the following system resources: - Memory and device regions in the system address map. - PE system registers. - A range of asynchronous exceptions e.g. interrupts. - A range of synchronous exceptions e.g. SMC function identifiers. A Secure Partition enables privileged firmware to implement only the absolutely essential secure services in EL3 and instantiate the rest in a partition. Since the partition executes in S-EL0, its implementation cannot be overly complex. The component in ARM Trusted Firmware responsible for managing a Secure Partition is called the Secure Partition Manager (SPM). The SPM is responsible for the following: - Validating and allocating resources requested by a Secure Partition. - Implementing a well defined interface that is used for initialising a Secure Partition. - Implementing a well defined interface that is used by the normal world and other secure services for accessing the services exported by a Secure Partition. - Implementing a well defined interface that is used by a Secure Partition to fulfil service requests. - Instantiating the software execution environment required by a Secure Partition to fulfil a service request. Change-Id: I6f7862d6bba8732db5b73f54e789d717a35e802f Co-authored-by: Douglas Raillard &lt;douglas.raillard@arm.com&gt; Co-authored-by: Sandrine Bailleux &lt;sandrine.bailleux@arm.com&gt; Co-authored-by: Achin Gupta &lt;achin.gupta@arm.com&gt; Co-authored-by: Antonio Nino Diaz &lt;antonio.ninodiaz@arm.com&gt; Signed-off-by: Antonio Nino Diaz &lt;antonio.ninodiaz@arm.com&gt;
7 years ago
#if ENABLE_SPM
/*
* Exception vectors of the SPM shim layer. They must be aligned to a 2K
* address, but we need to place them in a separate page so that we can set
* individual permissions to them, so the actual alignment needed is 4K.
*
* There's no need to include this into the RO section of BL31 because it
* doesn't need to be accessed by BL31.
*/
spm_shim_exceptions : ALIGN(PAGE_SIZE) {
SPM: Introduce Secure Partition Manager A Secure Partition is a software execution environment instantiated in S-EL0 that can be used to implement simple management and security services. Since S-EL0 is an unprivileged exception level, a Secure Partition relies on privileged firmware e.g. ARM Trusted Firmware to be granted access to system and processor resources. Essentially, it is a software sandbox that runs under the control of privileged software in the Secure World and accesses the following system resources: - Memory and device regions in the system address map. - PE system registers. - A range of asynchronous exceptions e.g. interrupts. - A range of synchronous exceptions e.g. SMC function identifiers. A Secure Partition enables privileged firmware to implement only the absolutely essential secure services in EL3 and instantiate the rest in a partition. Since the partition executes in S-EL0, its implementation cannot be overly complex. The component in ARM Trusted Firmware responsible for managing a Secure Partition is called the Secure Partition Manager (SPM). The SPM is responsible for the following: - Validating and allocating resources requested by a Secure Partition. - Implementing a well defined interface that is used for initialising a Secure Partition. - Implementing a well defined interface that is used by the normal world and other secure services for accessing the services exported by a Secure Partition. - Implementing a well defined interface that is used by a Secure Partition to fulfil service requests. - Instantiating the software execution environment required by a Secure Partition to fulfil a service request. Change-Id: I6f7862d6bba8732db5b73f54e789d717a35e802f Co-authored-by: Douglas Raillard &lt;douglas.raillard@arm.com&gt; Co-authored-by: Sandrine Bailleux &lt;sandrine.bailleux@arm.com&gt; Co-authored-by: Achin Gupta &lt;achin.gupta@arm.com&gt; Co-authored-by: Antonio Nino Diaz &lt;antonio.ninodiaz@arm.com&gt; Signed-off-by: Antonio Nino Diaz &lt;antonio.ninodiaz@arm.com&gt;
7 years ago
__SPM_SHIM_EXCEPTIONS_START__ = .;
*(.spm_shim_exceptions)
. = ALIGN(PAGE_SIZE);
SPM: Introduce Secure Partition Manager A Secure Partition is a software execution environment instantiated in S-EL0 that can be used to implement simple management and security services. Since S-EL0 is an unprivileged exception level, a Secure Partition relies on privileged firmware e.g. ARM Trusted Firmware to be granted access to system and processor resources. Essentially, it is a software sandbox that runs under the control of privileged software in the Secure World and accesses the following system resources: - Memory and device regions in the system address map. - PE system registers. - A range of asynchronous exceptions e.g. interrupts. - A range of synchronous exceptions e.g. SMC function identifiers. A Secure Partition enables privileged firmware to implement only the absolutely essential secure services in EL3 and instantiate the rest in a partition. Since the partition executes in S-EL0, its implementation cannot be overly complex. The component in ARM Trusted Firmware responsible for managing a Secure Partition is called the Secure Partition Manager (SPM). The SPM is responsible for the following: - Validating and allocating resources requested by a Secure Partition. - Implementing a well defined interface that is used for initialising a Secure Partition. - Implementing a well defined interface that is used by the normal world and other secure services for accessing the services exported by a Secure Partition. - Implementing a well defined interface that is used by a Secure Partition to fulfil service requests. - Instantiating the software execution environment required by a Secure Partition to fulfil a service request. Change-Id: I6f7862d6bba8732db5b73f54e789d717a35e802f Co-authored-by: Douglas Raillard &lt;douglas.raillard@arm.com&gt; Co-authored-by: Sandrine Bailleux &lt;sandrine.bailleux@arm.com&gt; Co-authored-by: Achin Gupta &lt;achin.gupta@arm.com&gt; Co-authored-by: Antonio Nino Diaz &lt;antonio.ninodiaz@arm.com&gt; Signed-off-by: Antonio Nino Diaz &lt;antonio.ninodiaz@arm.com&gt;
7 years ago
__SPM_SHIM_EXCEPTIONS_END__ = .;
} >RAM
#endif
Make generic code work in presence of system caches On the ARMv8 architecture, cache maintenance operations by set/way on the last level of integrated cache do not affect the system cache. This means that such a flush or clean operation could result in the data being pushed out to the system cache rather than main memory. Another CPU could access this data before it enables its data cache or MMU. Such accesses could be serviced from the main memory instead of the system cache. If the data in the sysem cache has not yet been flushed or evicted to main memory then there could be a loss of coherency. The only mechanism to guarantee that the main memory will be updated is to use cache maintenance operations to the PoC by MVA(See section D3.4.11 (System level caches) of ARMv8-A Reference Manual (Issue A.g/ARM DDI0487A.G). This patch removes the reliance of Trusted Firmware on the flush by set/way operation to ensure visibility of data in the main memory. Cache maintenance operations by MVA are now used instead. The following are the broad category of changes: 1. The RW areas of BL2/BL31/BL32 are invalidated by MVA before the C runtime is initialised. This ensures that any stale cache lines at any level of cache are removed. 2. Updates to global data in runtime firmware (BL31) by the primary CPU are made visible to secondary CPUs using a cache clean operation by MVA. 3. Cache maintenance by set/way operations are only used prior to power down. NOTE: NON-UPSTREAM TRUSTED FIRMWARE CODE SHOULD MAKE EQUIVALENT CHANGES IN ORDER TO FUNCTION CORRECTLY ON PLATFORMS WITH SUPPORT FOR SYSTEM CACHES. Fixes ARM-software/tf-issues#205 Change-Id: I64f1b398de0432813a0e0881d70f8337681f6e9a
9 years ago
/*
* Define a linker symbol to mark start of the RW memory area for this
* image.
*/
__RW_START__ = . ;
/*
* .data must be placed at a lower address than the stacks if the stack
* protector is enabled. Alternatively, the .data.stack_protector_canary
* section can be placed independently of the main .data section.
*/
.data . : {
__DATA_START__ = .;
*(.data*)
__DATA_END__ = .;
} >RAM
fvp: Reuse BL1 and BL2 memory through image overlaying This patch re-organizes the memory layout on FVP as to give the BL3-2 image as much memory as possible. Considering these two facts: - not all images need to live in memory at the same time. Once in BL3-1, the memory used by BL1 and BL2 can be reclaimed. - when BL2 loads the BL3-1 and BL3-2 images, it only considers the PROGBITS sections of those 2 images. The memory occupied by the NOBITS sections will be touched only at execution of the BL3-x images; Then it is possible to choose the different base addresses such that the NOBITS sections of BL3-1 and BL3-2 overlay BL1 and BL2. On FVP we choose to put: - BL1 and BL3-1 at the top of the Trusted RAM, with BL3-1 NOBITS sections overlaying BL1; - BL3-2 at the bottom of the Trusted RAM, with its NOBITS sections overlaying BL2; This is illustrated by the following diagram: 0x0404_0000 ------------ ------------------ | BL1 | &lt;= | BL3-1 NOBITS | ------------ &lt;= ------------------ | | &lt;= | BL3-1 PROGBITS | ------------ ------------------ | BL2 | &lt;= | BL3-2 NOBITS | ------------ &lt;= ------------------ | | &lt;= | BL3-2 PROGBITS | 0x0400_0000 ------------ ------------------ New platform-specific constants have been introduced to easily check at link time that BL3-1 and BL3-2 PROGBITS sections don&#39;t overwrite BL1 and BL2. These are optional and the platform code is free to define them or not. If not defined, the linker won&#39;t attempt to check image overlaying. Fixes ARM-software/tf-issues#117 Change-Id: I5981d1c3d66ee70eaac8bd052630c9ac6dd8b042
11 years ago
#ifdef BL31_PROGBITS_LIMIT
ASSERT(. <= BL31_PROGBITS_LIMIT, "BL31 progbits has exceeded its limit.")
fvp: Reuse BL1 and BL2 memory through image overlaying This patch re-organizes the memory layout on FVP as to give the BL3-2 image as much memory as possible. Considering these two facts: - not all images need to live in memory at the same time. Once in BL3-1, the memory used by BL1 and BL2 can be reclaimed. - when BL2 loads the BL3-1 and BL3-2 images, it only considers the PROGBITS sections of those 2 images. The memory occupied by the NOBITS sections will be touched only at execution of the BL3-x images; Then it is possible to choose the different base addresses such that the NOBITS sections of BL3-1 and BL3-2 overlay BL1 and BL2. On FVP we choose to put: - BL1 and BL3-1 at the top of the Trusted RAM, with BL3-1 NOBITS sections overlaying BL1; - BL3-2 at the bottom of the Trusted RAM, with its NOBITS sections overlaying BL2; This is illustrated by the following diagram: 0x0404_0000 ------------ ------------------ | BL1 | &lt;= | BL3-1 NOBITS | ------------ &lt;= ------------------ | | &lt;= | BL3-1 PROGBITS | ------------ ------------------ | BL2 | &lt;= | BL3-2 NOBITS | ------------ &lt;= ------------------ | | &lt;= | BL3-2 PROGBITS | 0x0400_0000 ------------ ------------------ New platform-specific constants have been introduced to easily check at link time that BL3-1 and BL3-2 PROGBITS sections don&#39;t overwrite BL1 and BL2. These are optional and the platform code is free to define them or not. If not defined, the linker won&#39;t attempt to check image overlaying. Fixes ARM-software/tf-issues#117 Change-Id: I5981d1c3d66ee70eaac8bd052630c9ac6dd8b042
11 years ago
#endif
stacks (NOLOAD) : {
__STACKS_START__ = .;
*(tzfw_normal_stacks)
__STACKS_END__ = .;
} >RAM
/*
* The .bss section gets initialised to 0 at runtime.
Introduce unified API to zero memory Introduce zeromem_dczva function on AArch64 that can handle unaligned addresses and make use of DC ZVA instruction to zero a whole block at a time. This zeroing takes place directly in the cache to speed it up without doing external memory access. Remove the zeromem16 function on AArch64 and replace it with an alias to zeromem. This zeromem16 function is now deprecated. Remove the 16-bytes alignment constraint on __BSS_START__ in firmware-design.md as it is now not mandatory anymore (it used to comply with zeromem16 requirements). Change the 16-bytes alignment constraints in SP min&#39;s linker script to a 8-bytes alignment constraint as the AArch32 zeromem implementation is now more efficient on 8-bytes aligned addresses. Introduce zero_normalmem and zeromem helpers in platform agnostic header that are implemented this way: * AArch32: * zero_normalmem: zero using usual data access * zeromem: alias for zero_normalmem * AArch64: * zero_normalmem: zero normal memory using DC ZVA instruction (needs MMU enabled) * zeromem: zero using usual data access Usage guidelines: in most cases, zero_normalmem should be preferred. There are 2 scenarios where zeromem (or memset) must be used instead: * Code that must run with MMU disabled (which means all memory is considered device memory for data accesses). * Code that fills device memory with null bytes. Optionally, the following rule can be applied if performance is important: * Code zeroing small areas (few bytes) that are not secrets should use memset to take advantage of compiler optimizations. Note: Code zeroing security-related critical information should use zero_normalmem/zeromem instead of memset to avoid removal by compilers&#39; optimizations in some cases or misbehaving versions of GCC. Fixes ARM-software/tf-issues#408 Change-Id: Iafd9663fc1070413c3e1904e54091cf60effaa82 Signed-off-by: Douglas Raillard &lt;douglas.raillard@arm.com&gt;
8 years ago
* Its base address should be 16-byte aligned for better performance of the
* zero-initialization code.
*/
Re-design bakery lock memory allocation and algorithm This patch unifies the bakery lock api&#39;s across coherent and normal memory implementation of locks by using same data type `bakery_lock_t` and similar arguments to functions. A separate section `bakery_lock` has been created and used to allocate memory for bakery locks using `DEFINE_BAKERY_LOCK`. When locks are allocated in normal memory, each lock for a core has to spread across multiple cache lines. By using the total size allocated in a separate cache line for a single core at compile time, the memory for other core locks is allocated at link time by multiplying the single core locks size with (PLATFORM_CORE_COUNT - 1). The normal memory lock algorithm now uses lock address instead of the `id` in the per_cpu_data. For locks allocated in coherent memory, it moves locks from tzfw_coherent_memory to bakery_lock section. The bakery locks are allocated as part of bss or in coherent memory depending on usage of coherent memory. Both these regions are initialised to zero as part of run_time_init before locks are used. Hence, bakery_lock_init() is made an empty function as the lock memory is already initialised to zero. The above design lead to the removal of psci bakery locks from non_cpu_power_pd_node to psci_locks. NOTE: THE BAKERY LOCK API WHEN USE_COHERENT_MEM IS NOT SET HAS CHANGED. THIS IS A BREAKING CHANGE FOR ALL PLATFORM PORTS THAT ALLOCATE BAKERY LOCKS IN NORMAL MEMORY. Change-Id: Ic3751c0066b8032dcbf9d88f1d4dc73d15f61d8b
9 years ago
.bss (NOLOAD) : ALIGN(16) {
__BSS_START__ = .;
*(.bss*)
*(COMMON)
Re-design bakery lock memory allocation and algorithm This patch unifies the bakery lock api&#39;s across coherent and normal memory implementation of locks by using same data type `bakery_lock_t` and similar arguments to functions. A separate section `bakery_lock` has been created and used to allocate memory for bakery locks using `DEFINE_BAKERY_LOCK`. When locks are allocated in normal memory, each lock for a core has to spread across multiple cache lines. By using the total size allocated in a separate cache line for a single core at compile time, the memory for other core locks is allocated at link time by multiplying the single core locks size with (PLATFORM_CORE_COUNT - 1). The normal memory lock algorithm now uses lock address instead of the `id` in the per_cpu_data. For locks allocated in coherent memory, it moves locks from tzfw_coherent_memory to bakery_lock section. The bakery locks are allocated as part of bss or in coherent memory depending on usage of coherent memory. Both these regions are initialised to zero as part of run_time_init before locks are used. Hence, bakery_lock_init() is made an empty function as the lock memory is already initialised to zero. The above design lead to the removal of psci bakery locks from non_cpu_power_pd_node to psci_locks. NOTE: THE BAKERY LOCK API WHEN USE_COHERENT_MEM IS NOT SET HAS CHANGED. THIS IS A BREAKING CHANGE FOR ALL PLATFORM PORTS THAT ALLOCATE BAKERY LOCKS IN NORMAL MEMORY. Change-Id: Ic3751c0066b8032dcbf9d88f1d4dc73d15f61d8b
9 years ago
#if !USE_COHERENT_MEM
/*
* Bakery locks are stored in normal .bss memory
*
* Each lock's data is spread across multiple cache lines, one per CPU,
* but multiple locks can share the same cache line.
* The compiler will allocate enough memory for one CPU's bakery locks,
* the remaining cache lines are allocated by the linker script
*/
. = ALIGN(CACHE_WRITEBACK_GRANULE);
__BAKERY_LOCK_START__ = .;
*(bakery_lock)
. = ALIGN(CACHE_WRITEBACK_GRANULE);
__PERCPU_BAKERY_LOCK_SIZE__ = ABSOLUTE(. - __BAKERY_LOCK_START__);
Re-design bakery lock memory allocation and algorithm This patch unifies the bakery lock api&#39;s across coherent and normal memory implementation of locks by using same data type `bakery_lock_t` and similar arguments to functions. A separate section `bakery_lock` has been created and used to allocate memory for bakery locks using `DEFINE_BAKERY_LOCK`. When locks are allocated in normal memory, each lock for a core has to spread across multiple cache lines. By using the total size allocated in a separate cache line for a single core at compile time, the memory for other core locks is allocated at link time by multiplying the single core locks size with (PLATFORM_CORE_COUNT - 1). The normal memory lock algorithm now uses lock address instead of the `id` in the per_cpu_data. For locks allocated in coherent memory, it moves locks from tzfw_coherent_memory to bakery_lock section. The bakery locks are allocated as part of bss or in coherent memory depending on usage of coherent memory. Both these regions are initialised to zero as part of run_time_init before locks are used. Hence, bakery_lock_init() is made an empty function as the lock memory is already initialised to zero. The above design lead to the removal of psci bakery locks from non_cpu_power_pd_node to psci_locks. NOTE: THE BAKERY LOCK API WHEN USE_COHERENT_MEM IS NOT SET HAS CHANGED. THIS IS A BREAKING CHANGE FOR ALL PLATFORM PORTS THAT ALLOCATE BAKERY LOCKS IN NORMAL MEMORY. Change-Id: Ic3751c0066b8032dcbf9d88f1d4dc73d15f61d8b
9 years ago
. = . + (__PERCPU_BAKERY_LOCK_SIZE__ * (PLATFORM_CORE_COUNT - 1));
__BAKERY_LOCK_END__ = .;
/*
* If BL31 doesn't use any bakery lock then __PERCPU_BAKERY_LOCK_SIZE__
* will be zero. For this reason, the only two valid values for
* __PERCPU_BAKERY_LOCK_SIZE__ are 0 or the platform defined value
* PLAT_PERCPU_BAKERY_LOCK_SIZE.
*/
Re-design bakery lock memory allocation and algorithm This patch unifies the bakery lock api&#39;s across coherent and normal memory implementation of locks by using same data type `bakery_lock_t` and similar arguments to functions. A separate section `bakery_lock` has been created and used to allocate memory for bakery locks using `DEFINE_BAKERY_LOCK`. When locks are allocated in normal memory, each lock for a core has to spread across multiple cache lines. By using the total size allocated in a separate cache line for a single core at compile time, the memory for other core locks is allocated at link time by multiplying the single core locks size with (PLATFORM_CORE_COUNT - 1). The normal memory lock algorithm now uses lock address instead of the `id` in the per_cpu_data. For locks allocated in coherent memory, it moves locks from tzfw_coherent_memory to bakery_lock section. The bakery locks are allocated as part of bss or in coherent memory depending on usage of coherent memory. Both these regions are initialised to zero as part of run_time_init before locks are used. Hence, bakery_lock_init() is made an empty function as the lock memory is already initialised to zero. The above design lead to the removal of psci bakery locks from non_cpu_power_pd_node to psci_locks. NOTE: THE BAKERY LOCK API WHEN USE_COHERENT_MEM IS NOT SET HAS CHANGED. THIS IS A BREAKING CHANGE FOR ALL PLATFORM PORTS THAT ALLOCATE BAKERY LOCKS IN NORMAL MEMORY. Change-Id: Ic3751c0066b8032dcbf9d88f1d4dc73d15f61d8b
9 years ago
#ifdef PLAT_PERCPU_BAKERY_LOCK_SIZE
ASSERT((__PERCPU_BAKERY_LOCK_SIZE__ == 0) || (__PERCPU_BAKERY_LOCK_SIZE__ == PLAT_PERCPU_BAKERY_LOCK_SIZE),
Re-design bakery lock memory allocation and algorithm This patch unifies the bakery lock api&#39;s across coherent and normal memory implementation of locks by using same data type `bakery_lock_t` and similar arguments to functions. A separate section `bakery_lock` has been created and used to allocate memory for bakery locks using `DEFINE_BAKERY_LOCK`. When locks are allocated in normal memory, each lock for a core has to spread across multiple cache lines. By using the total size allocated in a separate cache line for a single core at compile time, the memory for other core locks is allocated at link time by multiplying the single core locks size with (PLATFORM_CORE_COUNT - 1). The normal memory lock algorithm now uses lock address instead of the `id` in the per_cpu_data. For locks allocated in coherent memory, it moves locks from tzfw_coherent_memory to bakery_lock section. The bakery locks are allocated as part of bss or in coherent memory depending on usage of coherent memory. Both these regions are initialised to zero as part of run_time_init before locks are used. Hence, bakery_lock_init() is made an empty function as the lock memory is already initialised to zero. The above design lead to the removal of psci bakery locks from non_cpu_power_pd_node to psci_locks. NOTE: THE BAKERY LOCK API WHEN USE_COHERENT_MEM IS NOT SET HAS CHANGED. THIS IS A BREAKING CHANGE FOR ALL PLATFORM PORTS THAT ALLOCATE BAKERY LOCKS IN NORMAL MEMORY. Change-Id: Ic3751c0066b8032dcbf9d88f1d4dc73d15f61d8b
9 years ago
"PLAT_PERCPU_BAKERY_LOCK_SIZE does not match bakery lock requirements");
#endif
#endif
#if ENABLE_PMF
/*
* Time-stamps are stored in normal .bss memory
*
* The compiler will allocate enough memory for one CPU's time-stamps,
* the remaining memory for other CPU's is allocated by the
* linker script
*/
. = ALIGN(CACHE_WRITEBACK_GRANULE);
__PMF_TIMESTAMP_START__ = .;
KEEP(*(pmf_timestamp_array))
. = ALIGN(CACHE_WRITEBACK_GRANULE);
__PMF_PERCPU_TIMESTAMP_END__ = .;
__PERCPU_TIMESTAMP_SIZE__ = ABSOLUTE(. - __PMF_TIMESTAMP_START__);
. = . + (__PERCPU_TIMESTAMP_SIZE__ * (PLATFORM_CORE_COUNT - 1));
__PMF_TIMESTAMP_END__ = .;
#endif /* ENABLE_PMF */
__BSS_END__ = .;
} >RAM
/*
* The xlat_table section is for full, aligned page tables (4K).
* Removing them from .bss avoids forcing 4K alignment on
* the .bss section. The tables are initialized to zero by the translation
* tables library.
*/
xlat_table (NOLOAD) : {
*(xlat_table)
} >RAM
#if USE_COHERENT_MEM
/*
* The base address of the coherent memory section must be page-aligned (4K)
* to guarantee that the coherent data are stored on their own pages and
* are not mixed with normal data. This is required to set up the correct
* memory attributes for the coherent data page tables.
*/
coherent_ram (NOLOAD) : ALIGN(PAGE_SIZE) {
__COHERENT_RAM_START__ = .;
Re-design bakery lock memory allocation and algorithm This patch unifies the bakery lock api&#39;s across coherent and normal memory implementation of locks by using same data type `bakery_lock_t` and similar arguments to functions. A separate section `bakery_lock` has been created and used to allocate memory for bakery locks using `DEFINE_BAKERY_LOCK`. When locks are allocated in normal memory, each lock for a core has to spread across multiple cache lines. By using the total size allocated in a separate cache line for a single core at compile time, the memory for other core locks is allocated at link time by multiplying the single core locks size with (PLATFORM_CORE_COUNT - 1). The normal memory lock algorithm now uses lock address instead of the `id` in the per_cpu_data. For locks allocated in coherent memory, it moves locks from tzfw_coherent_memory to bakery_lock section. The bakery locks are allocated as part of bss or in coherent memory depending on usage of coherent memory. Both these regions are initialised to zero as part of run_time_init before locks are used. Hence, bakery_lock_init() is made an empty function as the lock memory is already initialised to zero. The above design lead to the removal of psci bakery locks from non_cpu_power_pd_node to psci_locks. NOTE: THE BAKERY LOCK API WHEN USE_COHERENT_MEM IS NOT SET HAS CHANGED. THIS IS A BREAKING CHANGE FOR ALL PLATFORM PORTS THAT ALLOCATE BAKERY LOCKS IN NORMAL MEMORY. Change-Id: Ic3751c0066b8032dcbf9d88f1d4dc73d15f61d8b
9 years ago
/*
* Bakery locks are stored in coherent memory
*
* Each lock's data is contiguous and fully allocated by the compiler
*/
*(bakery_lock)
*(tzfw_coherent_mem)
__COHERENT_RAM_END_UNALIGNED__ = .;
/*
* Memory page(s) mapped to this section will be marked
* as device memory. No other unexpected data must creep in.
* Ensure the rest of the current memory page is unused.
*/
. = ALIGN(PAGE_SIZE);
__COHERENT_RAM_END__ = .;
} >RAM
#endif
Make generic code work in presence of system caches On the ARMv8 architecture, cache maintenance operations by set/way on the last level of integrated cache do not affect the system cache. This means that such a flush or clean operation could result in the data being pushed out to the system cache rather than main memory. Another CPU could access this data before it enables its data cache or MMU. Such accesses could be serviced from the main memory instead of the system cache. If the data in the sysem cache has not yet been flushed or evicted to main memory then there could be a loss of coherency. The only mechanism to guarantee that the main memory will be updated is to use cache maintenance operations to the PoC by MVA(See section D3.4.11 (System level caches) of ARMv8-A Reference Manual (Issue A.g/ARM DDI0487A.G). This patch removes the reliance of Trusted Firmware on the flush by set/way operation to ensure visibility of data in the main memory. Cache maintenance operations by MVA are now used instead. The following are the broad category of changes: 1. The RW areas of BL2/BL31/BL32 are invalidated by MVA before the C runtime is initialised. This ensures that any stale cache lines at any level of cache are removed. 2. Updates to global data in runtime firmware (BL31) by the primary CPU are made visible to secondary CPUs using a cache clean operation by MVA. 3. Cache maintenance by set/way operations are only used prior to power down. NOTE: NON-UPSTREAM TRUSTED FIRMWARE CODE SHOULD MAKE EQUIVALENT CHANGES IN ORDER TO FUNCTION CORRECTLY ON PLATFORMS WITH SUPPORT FOR SYSTEM CACHES. Fixes ARM-software/tf-issues#205 Change-Id: I64f1b398de0432813a0e0881d70f8337681f6e9a
9 years ago
/*
* Define a linker symbol to mark end of the RW memory area for this
* image.
*/
__RW_END__ = .;
__BL31_END__ = .;
__BSS_SIZE__ = SIZEOF(.bss);
#if USE_COHERENT_MEM
__COHERENT_RAM_UNALIGNED_SIZE__ =
__COHERENT_RAM_END_UNALIGNED__ - __COHERENT_RAM_START__;
#endif
ASSERT(. <= BL31_LIMIT, "BL31 image has exceeded its limit.")
}