OSDN Git Service
Morten Rasmussen [Thu, 2 Jul 2015 16:16:34 +0000 (17:16 +0100)]
ANDROID: sched: Prevent unnecessary active balance of single task in sched group
Scenarios with the busiest group having just one task and the local
being idle on topologies with sched groups with different numbers of
cpus manage to dodge all load-balance bailout conditions resulting the
nr_balance_failed counter to be incremented. This eventually causes a
pointless active migration of the task. This patch prevents this by not
incrementing the counter when the busiest group only has one task.
ASYM_PACKING migrations and migrations due to reduced capacity should
still take place as these are explicitly captured by
need_active_balance().
A better solution would be to not attempt the load-balance in the first
place, but that requires significant changes to the order of bailout
conditions and statistics gathering.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I28f69c72febe0211decbe77b7bc3e48839d3d7b3
Morten Rasmussen [Thu, 28 Jun 2018 16:31:25 +0000 (17:31 +0100)]
FROMLIST: sched/core: Disable SD_PREFER_SIBLING on asymmetric cpu capacity domains
The 'prefer sibling' sched_domain flag is intended to encourage
spreading tasks to sibling sched_domain to take advantage of more caches
and core for SMT systems. It has recently been changed to be on all
non-NUMA topology level. However, spreading across domains with cpu
capacity asymmetry isn't desirable, e.g. spreading from high capacity to
low capacity cpus even if high capacity cpus aren't overutilized might
give access to more cache but the cpu will be slower and possibly lead
to worse overall throughput.
To prevent this, we need to remove SD_PREFER_SIBLING on the sched_domain
level immediately below SD_ASYM_CPUCAPACITY.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I8400c9a26d57da10e45b64c6f93bc1b99eb5c3c8
Morten Rasmussen [Thu, 28 Jun 2018 15:34:35 +0000 (16:34 +0100)]
FROMLIST: sched/core: Disable SD_ASYM_CPUCAPACITY for root_domains without asymmetry
When hotplugging cpus out or creating exclusive cpusets (disabling
sched_load_balance) systems which were asymmetric at boot might become
symmetric. In this case leaving the flag set might lead to suboptimal
scheduling decisions.
The arch-code proving the flag doesn't have visibility of the cpuset
configuration so it must either be told by passing a cpumask or the
generic topology code has to verify if the flag should still be set
when taking the actual sched_domain_span() into account. This patch
implements the latter approach.
We need to detect capacity based on calling arch_scale_cpu_capacity()
directly as rq->cpu_capacity_orig hasn't been set yet early in the boot
process.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I95dadee225f2217787713a61bd7b07e11e028fb0
Chris Redpath [Fri, 1 Jun 2018 11:15:45 +0000 (12:15 +0100)]
FROMLIST: sched/fair: Don't move tasks to lower capacity cpus unless necessary
When lower capacity CPUs are load balancing and considering to pull
something from a higher capacity group, we should not pull tasks from a
cpu with only one task running as this is guaranteed to impede progress
for that task. If there is more than one task running, load balance in
the higher capacity group would have already made any possible moves to
resolve imbalance and we should make better use of system compute
capacity by moving a task if we still have more than one running.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I66c47bc93f6533ae8f36f026c033ad5de80f3275
Valentin Schneider [Tue, 27 Feb 2018 16:56:41 +0000 (16:56 +0000)]
FROMLIST: sched/fair: Set rq->rd->overload when misfit
Idle balance is a great opportunity to pull a misfit task. However,
there are scenarios where misfit tasks are present but idle balance is
prevented by the overload flag.
A good example of this is a workload of n identical tasks. Let's suppose
we have a 2+2 Arm big.LITTLE system. We then spawn 4 fairly
CPU-intensive tasks - for the sake of simplicity let's say they are just
CPU hogs, even when running on big CPUs.
They are identical tasks, so on an SMP system they should all end at
(roughly) the same time. However, in our case the LITTLE CPUs are less
performing than the big CPUs, so tasks running on the LITTLEs will have
a longer completion time.
This means that the big CPUs will complete their work earlier, at which
point they should pull the tasks from the LITTLEs. What we want to
happen is summarized as follows:
a,b,c,d are our CPU-hogging tasks
_ signifies idling
LITTLE_0 | a a a a _ _
LITTLE_1 | b b b b _ _
---------|-------------
big_0 | c c c c a a
big_1 | d d d d b b
^
^
Tasks end on the big CPUs, idle balance happens
and the misfit tasks are pulled straight away
This however won't happen, because currently the overload flag is only
set when there is any CPU that has more than one runnable task - which
may very well not be the case here if our CPU-hogging workload is all
there is to run.
As such, this commit sets the overload flag in update_sg_lb_stats when
a group is flagged as having a misfit task.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: I86d5aab67bd2c91a0b193f21844a4634a492f331
Valentin Schneider [Tue, 27 Feb 2018 16:20:21 +0000 (16:20 +0000)]
FROMLIST: sched: Wrap rq->rd->overload accesses with READ/WRITE_ONCE
This variable can be read and set locklessly within update_sd_lb_stats().
As such, READ/WRITE_ONCE are added to make sure nothing terribly wrong
can happen because of the compiler.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I2b35baaebb9a55afa7877b049d2af1c255567601
Valentin Schneider [Tue, 27 Feb 2018 15:49:40 +0000 (15:49 +0000)]
FROMLIST: sched: Change root_domain->overload type to int
sizeof(_Bool) is implementation defined, so let's just go with 'int' as
is done for other structures e.g. sched_domain_shared->has_idle_cores.
The local 'overload' variable used in update_sd_lb_stats can remain
bool, as it won't impact any struct layout and can be assigned to the
root_domain field.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: I2d90d5cde12356ede87cb77cd155a74eae4b0e54
Valentin Schneider [Fri, 29 Jun 2018 13:50:31 +0000 (14:50 +0100)]
FROMLIST: sched/fair: Change prefer_sibling type to bool
This variable is entirely local to update_sd_lb_stats, so we can
safely change its type and slightly clean up its initialisation.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: I386c96732ca8499ef056c59cb49f399bca062e8c
Valentin Schneider [Thu, 15 Feb 2018 16:20:52 +0000 (16:20 +0000)]
FROMLIST: sched/fair: Kick nohz balance if rq->misfit_task
There already are a few conditions in nohz_kick_needed() to ensure
a nohz kick is triggered, but they are not enough for some misfit
task scenarios. Excluding asym packing, those are:
* rq->nr_running >=2: Not relevant here because we are running a
misfit task, it needs to be migrated regardless and potentially through
active balance.
* sds->nr_busy_cpus > 1: If there is only the misfit task being run
on a group of low capacity cpus, this will be evaluated to False.
* rq->cfs.h_nr_running >=1 && check_cpu_capacity(): Not relevant here,
misfit task needs to be migrated regardless of rt/IRQ pressure
As such, this commit adds an rq->misfit_task condition to trigger a
nohz kick.
The idea to kick a nohz balance for misfit tasks originally came from
Leo Yan <leo.yan@linaro.org>, and a similar patch was submitted for
the Android Common Kernel - see [1].
[1]: https://lists.linaro.org/pipermail/eas-dev/2016-September/000551.html
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: Ieddf2ff6553e08b52dafb006014db61f8789b43f
Morten Rasmussen [Thu, 1 Feb 2018 14:56:16 +0000 (14:56 +0000)]
FROMLIST: sched/fair: Consider misfit tasks when load-balancing
On asymmetric cpu capacity systems load intensive tasks can end up on
cpus that don't suit their compute demand. In this scenarios 'misfit'
tasks should be migrated to cpus with higher compute capacity to ensure
better throughput. group_misfit_task indicates this scenario, but tweaks
to the load-balance code are needed to make the migrations happen.
Misfit balancing only makes sense between a source group of lower
per-cpu capacity and destination group of higher compute capacity.
Otherwise, misfit balancing is ignored. group_misfit_task has lowest
priority so any imbalance due to overload is dealt with first.
The modifications are:
1. Only pick a group containing misfit tasks as the busiest group if the
destination group has higher capacity and has spare capacity.
2. When the busiest group is a 'misfit' group, skip the usual average
load and group capacity checks.
3. Set the imbalance for 'misfit' balancing sufficiently high for a task
to be pulled ignoring average load.
4. Pick the cpu with the highest misfit load as the source cpu.
5. If the misfit task is alone on the source cpu, go for active
balancing.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: Ic28f3295ba7115dee774c3ad6576866ee985aee5
Morten Rasmussen [Wed, 16 May 2018 12:35:32 +0000 (13:35 +0100)]
FROMLIST: sched: Add sched_group per-cpu max capacity
The current sg->min_capacity tracks the lowest per-cpu compute capacity
available in the sched_group when rt/irq pressure is taken into account.
Minimum capacity isn't the ideal metric for tracking if a sched_group
needs offloading to another sched_group for some scenarios, e.g. a
sched_group with multiple cpus if only one is under heavy pressure.
Tracking maximum capacity isn't perfect either but a better choice for
some situations as it indicates that the sched_group definitely compute
capacity constrained either due to rt/irq pressure on all cpus or
asymmetric cpu capacities (e.g. big.LITTLE).
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I227a3c118a30993da30ff2164c531d0df9039f32
Morten Rasmussen [Fri, 17 Jul 2015 15:45:07 +0000 (16:45 +0100)]
FROMLIST: sched/fair: Add group_misfit_task load-balance type
To maximize throughput in systems with asymmetric cpu capacities (e.g.
ARM big.LITTLE) load-balancing has to consider task and cpu utilization
as well as per-cpu compute capacity when load-balancing in addition to
the current average load based load-balancing policy. Tasks with high
utilization that are scheduled on a lower capacity cpu need to be
identified and migrated to a higher capacity cpu if possible to maximize
throughput.
To implement this additional policy an additional group_type
(load-balance scenario) is added: group_misfit_task. This represents
scenarios where a sched_group has one or more tasks that are not
suitable for its per-cpu capacity. group_misfit_task is only considered
if the system is not overloaded or imbalanced (group_imbalanced or
group_overloaded).
Identifying misfit tasks requires the rq lock to be held. To avoid
taking remote rq locks to examine source sched_groups for misfit tasks,
each cpu is responsible for tracking misfit tasks themselves and update
the rq->misfit_task flag. This means checking task utilization when
tasks are scheduled and on sched_tick.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I4c746a536a5ffd65d9102f0b6c09ca573c57f575
Morten Rasmussen [Thu, 1 Feb 2018 11:44:13 +0000 (11:44 +0000)]
FROMLIST: sched: Add static_key for asymmetric cpu capacity optimizations
The existing asymmetric cpu capacity code should cause minimal overhead
for others. Putting it behind a static_key, it has been done for SMT
optimizations, would make it easier to extend and improve without
causing harm to others moving forward.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Change-Id: I6c87bd086f1e3fada16d8c39257f8d02b41fa794
Quentin Perret [Fri, 10 Nov 2017 12:17:34 +0000 (12:17 +0000)]
FROMLIST: sched/fair: Select an energy-efficient CPU on task wake-up
If an Energy Model (EM) is available and if the system isn't
overutilized, re-route waking tasks into an energy-aware placement
algorithm. The selection of an energy-efficient CPU for a task
is achieved by estimating the impact on system-level active energy
resulting from the placement of the task on the CPU with the highest
spare capacity in each frequency domain. This strategy spreads tasks in
a frequency domain and avoids overly aggressive task packing. The best
CPU energy-wise is then selected if it saves a large enough amount of
energy with respect to prev_cpu.
Although it has already shown significant benefits on some existing
targets, this approach cannot scale to platforms with numerous CPUs.
This is an attempt to do something useful as writing a fast heuristic
that performs reasonably well on a broad spectrum of architectures isn't
an easy task. As such, the scope of usability of the energy-aware
wake-up path is restricted to systems with the SD_ASYM_CPUCAPACITY flag
set, and where the EM isn't too complex.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: I7d0559ecb55bd88c84458086595a794ceccf3825
Quentin Perret [Fri, 10 Nov 2017 11:20:03 +0000 (11:20 +0000)]
FROMLIST: sched/fair: Introduce an energy estimation helper function
In preparation for the definition of an energy-aware wakeup path,
introduce a helper function to estimate the consequence on system energy
when a specific task wakes-up on a specific CPU. compute_energy()
estimates the capacity state to be reached by all frequency domains and
estimates the consumption of each online CPU according to its Energy
Model and its percentage of busy time.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: I52357c29be56a411e1f875e0de7a254a2f3bd8cc
Morten Rasmussen [Sat, 9 May 2015 15:49:57 +0000 (16:49 +0100)]
FROMLIST: sched: Add over-utilization/tipping point indicator
Energy-aware scheduling is only meant to be active while the system is
_not_ over-utilized. That is, there are spare cycles available to shift
tasks around based on their actual utilization to get a more
energy-efficient task distribution without depriving any tasks. When
above the tipping point task placement is done the traditional way based
on load_avg, spreading the tasks across as many cpus as possible based
on priority scaled load to preserve smp_nice. Below the tipping point we
want to use util_avg instead. We need to define a criteria for when we
make the switch.
The util_avg for each cpu converges towards 100% (1024) regardless of
how many task additional task we may put on it. If we define
over-utilized as:
sum_{cpus}(rq.cfs.avg.util_avg) + margin > sum_{cpus}(rq.capacity)
some individual cpus may be over-utilized running multiple tasks even
when the above condition is false. That should be okay as long as we try
to spread the tasks out to avoid per-cpu over-utilization as much as
possible and if all tasks have the _same_ priority. If the latter isn't
true, we have to consider priority to preserve smp_nice.
For example, we could have n_cpus nice=-10 util_avg=55% tasks and
n_cpus/2 nice=0 util_avg=60% tasks. Balancing based on util_avg we are
likely to end up with nice=-10 tasks sharing cpus and nice=0 tasks
getting their own as we 1.5*n_cpus tasks in total and 55%+55% is less
over-utilized than 55%+60% for those cpus that have to be shared. The
system utilization is only 85% of the system capacity, but we are
breaking smp_nice.
To be sure not to break smp_nice, we have defined over-utilization
conservatively as when any cpu in the system is fully utilized at it's
highest frequency instead:
cpu_rq(any).cfs.avg.util_avg + margin > cpu_rq(any).capacity
IOW, as soon as one cpu is (nearly) 100% utilized, we switch to load_avg
to factor in priority to preserve smp_nice.
With this definition, we can skip periodic load-balance as no cpu has an
always-running task when the system is not over-utilized. All tasks will
be periodic and we can balance them at wake-up. This conservative
condition does however mean that some scenarios that could benefit from
energy-aware decisions even if one cpu is fully utilized would not get
those benefits.
For system where some cpus might have reduced capacity on some cpus
(RT-pressure and/or big.LITTLE), we want periodic load-balance checks as
soon a just a single cpu is fully utilized as it might one of those with
reduced capacity and in that case we want to migrate it.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: I93a4512b1fc9fa3caa5e7d8940c38d2ad1dac0e3
Quentin Perret [Thu, 21 Jun 2018 14:34:57 +0000 (15:34 +0100)]
FROMLIST: sched/topology: Introduce sched_energy_present static key
In order to ensure a minimal performance impact on non-energy-aware
systems, introduce a static_key guarding the access to Energy-Aware
Scheduling (EAS) code.
The static key is set iff all the following conditions are met for at
least one root domain:
1. all online CPUs of the root domain are covered by the Energy
Model (EM);
2. the complexity of the root domain's EM is low enough to keep
scheduling overheads low;
3. the root domain has an asymmetric CPU capacity topology (detected
by looking for the SD_ASYM_CPUCAPACITY flag in the sched_domain
hierarchy).
The static key is checked in the rd_freq_domain() function which returns
the frequency domains of a root domain when they are available. As EAS
cannot be enabled with CONFIG_ENERGY_MODEL=n, rd_freq_domain() is
stubbed to 'NULL' to let the compiler remove the unused EAS code by
constant propagation.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com
Change-Id: Ie8aeb0ac089a5b2a4e51105f70fe97dc51d05c1d
Quentin Perret [Fri, 18 May 2018 09:16:33 +0000 (10:16 +0100)]
FROMLIST: sched/topology: Lowest energy aware balancing sched_domain level pointer
Add another member to the family of per-cpu sched_domain shortcut
pointers. This one, sd_ea, points to the lowest level at which energy
aware scheduling should be used.
Generally speaking, the largest opportunity to save energy via scheduling
comes from a smarter exploitation of heterogeneous platforms (i.e.
big.LITTLE). Consequently, the sd_ea shortcut is wired to the lowest
scheduling domain at which the SD_ASYM_CPUCAPACITY flag is set. For
example, it is possible to apply energy-aware scheduling within a socket
on a multi-socket system, as long as each socket has an asymmetric
topology. Cross-sockets wake-up balancing will only happen when the
system is over-utilized, or this_cpu and prev_cpu are in different
sockets.
cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: I8100e06dc50dd4bd98960e73b51e72c7994740c2
Quentin Perret [Mon, 30 Apr 2018 10:23:39 +0000 (11:23 +0100)]
FROMLIST: sched/topology: Reference the Energy Model of CPUs when available
The existing scheduling domain hierarchy is defined to map to the cache
topology of the system. However, Energy Aware Scheduling (EAS) requires
more knowledge about the platform, and specifically needs to know about
the span of Frequency Domains (FD), which do not always align with
caches.
To address this issue, use the Energy Model (EM) of the system to extend
the scheduler topology code with a representation of the FDs, alongside
the scheduling domains. More specifically, a linked list of FDs is
attached to each root domain. When multiple root domains are in use,
each list contains only the FDs covering the CPUs of its root domain. If
a FD spans over CPUs of two different root domains, it will be
duplicated in both lists.
The lists are fully maintained by the scheduler from
partition_sched_domains() in order to cope with hotplug and cpuset
changes. As for scheduling domains, the list are protected by RCU to
ensure safe concurrent updates.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: Ibdb9d735b44006c3a027d95bd461a9132da9e41a
Quentin Perret [Thu, 17 May 2018 14:50:07 +0000 (15:50 +0100)]
FROMLIST: PM / EM: Expose the Energy Model in sysfs
Expose the Energy Model (read-only) of all frequency domains in sysfs
for convenience. To do so, add a kobject to the CPU subsystem under the
umbrella of which a kobject for each frequency domain is attached.
The resulting hierarchy is as follows for a platform with two frequency
domains for example:
/sys/devices/system/cpu/energy_model
├── fd0
│ ├── capacity
│ ├── cpus
│ ├── frequency
│ └── power
└── fd4
├── capacity
├── cpus
├── frequency
└── power
In this implementation, the kobject abstraction is only used as a
convenient way of exposing data to sysfs. However, it could also be
used in the future to allocate and release frequency domains in a more
dynamic way using reference counting.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: I90775764f3ed1bbbf1f490f4ebed665570acfe4a
Quentin Perret [Wed, 25 Apr 2018 17:34:11 +0000 (18:34 +0100)]
FROMLIST: PM: Introduce an Energy Model management framework
Several subsystems in the kernel (task scheduler and/or thermal at the
time of writing) can benefit from knowing about the energy consumed by
CPUs. Yet, this information can come from different sources (DT or
firmware for example), in different formats, hence making it hard to
exploit without a standard API.
As an attempt to address this, introduce a centralized Energy Model
(EM) management framework which aggregates the power values provided
by drivers into a table for each frequency domain in the system. The
power cost tables are made available to interested clients (e.g. task
scheduler or thermal) via platform-agnostic APIs. The overall design
is represented by the diagram below (focused on Arm-related drivers as
an example, but hopefully applicable to any architecture):
+---------------+ +-----------------+ +---------+
| Thermal (IPA) | | Scheduler (EAS) | | Other ? |
+---------------+ +-----------------+ +---------+
| | em_fd_energy() |
| | em_cpu_get() |
+-----------+ | +--------+
| | |
v v v
+---------------------+
| | +---------------+
| Energy Model | | arch_topology |
| |<--------| driver |
| Framework | +---------------+
| | em_rescale_cpu_capacity()
+---------------------+
^ ^ ^
| | | em_register_freq_domain()
+----------+ | +---------+
| | |
+---------------+ +---------------+ +--------------+
| cpufreq-dt | | arm_scmi | | Other |
+---------------+ +---------------+ +--------------+
^ ^ ^
| | |
+--------------+ +---------------+ +--------------+
| Device Tree | | Firmware | | ? |
+--------------+ +---------------+ +--------------+
Drivers (typically, but not limited to, CPUFreq drivers) can register
data in the EM framework using the em_register_freq_domain() API. The
calling driver must provide a callback function with a standardized
signature that will be used by the EM framework to build the power
cost tables of the frequency domain. This design should offer a lot of
flexibility to calling drivers which are free of reading information
from any location and to use any technique to compute power costs.
Moreover, the capacity states registered by drivers in the EM framework
are not required to match real performance states of the target. This
is particularly important on targets where the performance states are
not known by the OS.
On the client side, the EM framework offers APIs to access the power
cost tables of a CPU (em_cpu_get()), and to estimate the energy
consumed by the CPUs of a frequency domain (em_fd_energy()). Clients
such as the task scheduler can then use these APIs to access the shared
data structures holding the Energy Model of CPUs.
The EM framework also provides an API (em_rescale_cpu_capacity()) to
re-scale the capacity values of the model asynchronously, after it has
been created. This is required for architectures where the capacity
scale factor of CPUs can change at run-time. This is the case for
Arm/Arm64 for example where the arch_topology driver recomputes the
capacity scale factors of the CPUs after the maximum frequency of all
CPUs has been discovered. Although complex, the process of creating and
re-scaling the EM has to be kept in two separate steps to fulfill the
needs of the different users. The thermal subsystem doesn't use the
capacity values and shouldn't have dependencies on subsystems providing
them. On the other hand, the task scheduler needs the capacity values,
and it will benefit from seeing them up-to-date when applicable.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: Ia3323ee10827d0c4124329261c048e0632a8a9af
Quentin Perret [Fri, 27 Apr 2018 14:05:47 +0000 (15:05 +0100)]
FROMLIST: sched/cpufreq: Factor out utilization to frequency mapping
The schedutil governor maps utilization values to frequencies by applying
a 25% margin. Since this sort of mapping mechanism can be needed by other
users (i.e. EAS), factor the utilization-to-frequency mapping code out
of schedutil and move it to include/linux/sched/cpufreq.h to avoid code
duplication. The new map_util_freq() function is inlined to avoid
overheads.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: I978019574f5d0bb30a12b87df6652a59372f7b71
Quentin Perret [Wed, 25 Apr 2018 13:12:58 +0000 (14:12 +0100)]
FROMLIST: sched: Relocate arch_scale_cpu_capacity
By default, arch_scale_cpu_capacity() is only visible from within the
kernel/sched folder. Relocate it to include/linux/sched/topology.h to
make it visible to other clients needing to know about the capacity of
CPUs, such as the Energy Model framework.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Change-Id: I84e9681206d89d0c3a624999af40a6a5e57fa8dd
Amit Pundir [Thu, 2 Aug 2018 11:01:08 +0000 (16:31 +0530)]
RFC: ANDROID: dm: android-verity: Build fixes for v4.18
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Amit Pundir [Mon, 9 Jul 2018 06:50:13 +0000 (12:20 +0530)]
RFC: ANDROID: net: ipv6: Flip FIB entries to fib6_info
Convert all code paths referencing a FIB entry from rt6_info
to fib6_info. Align with upstream commit
8d1c802b2815
("net/ipv6: Flip FIB entries to fib6_info") changes.
Fixes: Change-Id: I82d16e3737d9cdfa6489e649e247894d0d60cbb1
("ANDROID: net: ipv6: autoconf routes into per-device tables")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Amit Pundir [Mon, 9 Jul 2018 05:20:01 +0000 (10:50 +0530)]
RFC: ANDROID: proc/uid: switch instantiate_t to d_splice_alias()
Fix proc_fill_cache() now that upstream commit
0168b9e38c42
("procfs: switch instantiate_t to d_splice_alias()"),
switched instantiate() callback to d_splice_alias().
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Amit Pundir [Fri, 9 Mar 2018 14:21:44 +0000 (19:51 +0530)]
RFC: ANDROID: fs: sdcardfs: Use inode iversion helpers
Upstream commit
f02a9ad1f15d ("fs: handle inode->i_version more efficiently")
converted the inode -> i_version counter to an atomic64_t. So move to using
relevant iversion helper routines for basic operations instead.
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Amit Pundir [Mon, 8 Jan 2018 17:25:09 +0000 (22:55 +0530)]
RFC: ANDROID: net: ipv4: sysfs_net_ipv4: Fix TCP window size controlling knobs
Refactor /sys/kernel/ipv4/tcp_* send/receive window size controlling
knobs to align with the changes of upstream commit
356d1833b638
("tcp: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem").
Fixes: ("ANDROID: net: ipv4: sysfs_net_ipv4: Add sysfs-based knobs for controlling TCP window size")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Amit Pundir [Mon, 8 Jan 2018 17:21:52 +0000 (22:51 +0530)]
RFC: ANDROID: net: ipv4: tcp: Namespace-ify sysctl_tcp_default_init_rwnd
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Chenbo Feng [Wed, 29 Nov 2017 02:22:11 +0000 (18:22 -0800)]
ANDROID: qtaguid: Fix the UAF probelm with tag_ref_tree
When multiple threads is trying to tag/delete the same socket at the
same time, there is a chance the tag_ref_entry of the target socket to
be null before the uid_tag_data entry is freed. It is caused by the
ctrl_cmd_tag function where it doesn't correctly grab the spinlocks
when tagging a socket.
Signed-off-by: Chenbo Feng <fengc@google.com>
Bug:
65853158
Change-Id: I5d89885918054cf835370a52bff2d693362ac5f0
Daniel Rosenberg [Thu, 26 Jul 2018 23:32:09 +0000 (16:32 -0700)]
ANDROID: sdcardfs: Check stacked filesystem depth
bug:
111860541
Change-Id: Ia0a30b2b8956c4ada28981584cd8647713a1e993
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Alistair Strachan [Fri, 27 Jul 2018 16:18:28 +0000 (09:18 -0700)]
ANDROID: verity: really fix android-verity Kconfig
The change "ANDROID: verity: fix android-verity Kconfig dependencies"
relaxed the dependency on DM_VERITY=y to just DM_VERITY, but this is not
correct because there are parts of the verity and dm-mod API that
android-verity is using but which are not exported to modules.
Work around this problem by disallowing android-verity to be built-in
when the dm/verity core is built modularly.
Bug:
72722987
Change-Id: I3cfaa2acca8e4a4b5c459afdddd958ac9f8c1eb3
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Wed, 25 Jul 2018 23:11:09 +0000 (16:11 -0700)]
x86_64_cuttlefish_defconfig: Enable android-verity
Bug:
72722987
Test: Build & boot with x86_64_cuttlefish_defconfig
Change-Id: I961e6aaa944b5ab0c005cb39604a52f8dc98fb06
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Wed, 25 Jul 2018 23:11:38 +0000 (16:11 -0700)]
x86_64_cuttlefish_defconfig: enable verity cert
Bug:
72722987
Test: Build, boot and verify in /proc/keys
Change-Id: Ia55b94d56827003a88cb6083a75340ee31347470
Signed-off-by: Alistair Strachan <astrachan@google.com>
Sandeep Patil [Tue, 24 Jul 2018 23:59:40 +0000 (16:59 -0700)]
ANDROID: android-verity: Fix broken parameter handling.
android-verity documentation states that the target expectets
the key, followed by the backing device on the commandline as follows
"dm=system none ro,0 1 android-verity <public-key-id> <backing-partition>"
However, the code actually expects the backing device as the first
parameter. Fix that.
Bug:
72722987
Change-Id: Ibd56c0220f6003bdfb95aa2d611f787e75a65c97
Signed-off-by: Sandeep Patil <sspatil@google.com>
Sandeep Patil [Mon, 23 Jul 2018 23:31:32 +0000 (16:31 -0700)]
ANDROID: android-verity: Make it work with newer kernels
Fixed bio API calls as they changed from 4.4 to 4.9.
Fixed the driver to use the new verify_signature_one() API.
Remove the dead code resulted from the rebase.
Bug:
72722987
Test: Build and boot hikey with system partition mounted as root using
android-verity
Signed-off-by: Sandeep Patil <sspatil@google.com>
Change-Id: I1e29111d57b62f0451404c08d49145039dd00737
Sandeep Patil [Tue, 24 Jul 2018 16:40:07 +0000 (09:40 -0700)]
ANDROID: android-verity: Add API to verify signature with builtin keys.
The builtin keyring was exported prior to this which allowed
android-verity to simply lookup the key in the builtin keyring and
verify the signature of the verity metadata.
This is now broken as the kernel expects the signature to be
in pkcs#7 format (same used for module signing). Obviously, this doesn't
work with the verity metadata as we just append the raw signature in the
metadata .. sigh.
*This one time*, add an API to accept arbitrary signature and verify
that with a key from system's trusted keyring.
Bug:
72722987
Test:
$ adb push verity_fs.img /data/local/tmp/
$ adb root && adb shell
> cd /data/local/tmp
> losetup /dev/block/loop0 verity_fs.img
> dmctl create verity-fs android-verity 0 4200 Android:#
7e4333f9bba00adfe0ede979e28ed1920492b40f 7:0
> mount -t ext4 /dev/block/dm-0 temp/
> cat temp/foo.txt temp/bar.txt
Change-Id: I0c14f3cb2b587b73a4c75907367769688756213e
Signed-off-by: Sandeep Patil <sspatil@google.com>
Sandeep Patil [Wed, 18 Jul 2018 14:16:42 +0000 (07:16 -0700)]
ANDROID: verity: fix android-verity Kconfig dependencies
Bug:
72722987
Test: Android verity now shows up in 'make menuconfig'
Change-Id: I21c2f36c17f45e5eb0daa1257f5817f9d56527e7
Signed-off-by: Sandeep Patil <sspatil@google.com>
Pavankumar Kondeti [Wed, 21 Jun 2017 03:52:45 +0000 (09:22 +0530)]
ANDROID: uid_sys_stats: Replace tasklist lock with RCU in uid_cputime_show
Tasklist lock is acuquired in uid_cputime_show for updating the stats
for all tasks in the system. This can potentially disable preemption
for several milli seconds. Replace tasklist_lock with RCU read side
primitives.
Change-Id: Ife69cb577bfdceaae6eb21b9bda09a0fe687e140
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Daniel Rosenberg [Mon, 29 May 2017 23:38:16 +0000 (16:38 -0700)]
ANDROID: mnt: Fix next_descendent
next_descendent did not properly handle the case
where the initial mount had no slaves. In this case,
we would look for the next slave, but since don't
have a master, the check for wrapping around to the
start of the list will always fail. Instead, we check
for this case, and ensure that we end the iteration
when we come back to the root.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
62094374
Change-Id: I43dfcee041aa3730cb4b9a1161418974ef84812e
Sultan Alsawaf [Sun, 3 Jun 2018 17:47:51 +0000 (10:47 -0700)]
ANDROID: Fix massive cpufreq_times memory leaks
Every time _cpu_up() is called for a CPU, idle_thread_get() is called
which then re-initializes a CPU's idle thread that was already
previously created and cached in a global variable in
smpboot.c. idle_thread_get() calls init_idle() which then calls
__sched_fork(). __sched_fork() is where cpufreq_task_times_init() is,
and cpufreq_task_times_init() allocates memory for the task struct's
time_in_state array.
Since idle_thread_get() reuses a task struct instance that was already
previously created, this means that every time it calls init_idle(),
cpufreq_task_times_init() allocates this array again and overwrites
the existing allocation that the idle thread already had.
This causes memory to be leaked every time a CPU is onlined. In order
to fix this, move allocation of time_in_state into _do_fork to avoid
allocating it at all for idle threads. The cpufreq times interface is
intended to be used for tracking userspace tasks, so we can safely
remove it from the kernel's idle threads without killing any
functionality.
But that's not all!
Task structs can be freed outside of release_task(), which creates
another memory leak because a task struct can be freed without having
its cpufreq times allocation freed. To fix this, free the cpufreq
times allocation at the same time that task struct allocations are
freed, in free_task().
Since free_task() can also be called in error paths of copy_process()
after dup_task_struct(), set time_in_state to NULL immediately after
calling dup_task_struct() to avoid possible double free.
Bug description and fix adapted from patch submitted by
Sultan Alsawaf <sultanxda@gmail.com> at
https://android-review.googlesource.com/c/kernel/msm/+/700134
Bug:
110044919
Test: Hikey960 builds, boots & reports /proc/<pid>/time_in_state
correctly
Change-Id: I12fe7611fc88eb7f6c39f8f7629ad27b6ec4722c
Signed-off-by: Connor O'Brien <connoro@google.com>
Connor O'Brien [Fri, 13 Jul 2018 21:31:40 +0000 (14:31 -0700)]
ANDROID: Reduce use of #ifdef CONFIG_CPU_FREQ_TIMES
Add empty versions of functions to cpufreq_times.h to cut down on use
of #ifdef in .c files.
Test: kernel builds with and without CONFIG_CPU_FREQ_TIMES=y
Change-Id: I49ac364fac3d42bba0ca1801e23b15081094fb12
Signed-off-by: Connor O'Brien <connoro@google.com>
Lianjun Huang [Sat, 16 Jun 2018 14:59:46 +0000 (22:59 +0800)]
ANDROID: sdcardfs: fix potential crash when reserved_mb is not zero
sdcardfs_mkdir() calls check_min_free_space(). When reserved_mb is not zero, a negative dentry will be passed to
ext4_statfs() at last and ext4_statfs() will crash. The parent dentry is positive. So we use the parent dentry to
check free space.
Change-Id: I80ab9623fe59ba911f4cc9f0e029a1c6f7ee421b
Signed-off-by: Lianjun Huang <huanglianjun@vivo.com>
Nathan Chancellor [Sun, 1 Apr 2018 03:56:23 +0000 (20:56 -0700)]
ANDROID: xt_qtaguid: Remove unnecessary null checks to device's name
'name' will never be NULL since it isn't a plain pointer but an array
of char values.
../net/netfilter/xt_qtaguid.c:1195:27: warning: address of array
'(*el_dev)->name' will always evaluate to 'true'
[-Wpointer-bool-conversion]
if (unlikely(!(*el_dev)->name)) {
~~~~~~~~~~~~^~~~
Change-Id: If3b25f17829b43e8a639193fb9cd04ae45947200
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
(cherry picked from android-4.4 commit
207b579e3db6fd0cb6fe40ba3e929635ad748d89)
Signed-off-by: Chenbo Feng <fengc@google.com>
Patrik Torstensson [Fri, 13 Apr 2018 22:34:48 +0000 (15:34 -0700)]
ANDROID: Add kconfig to make dm-verity check_at_most_once default enabled
This change adds a kernel config for default enable
the check_at_most_once dm-verity option. This is to give us
the ability to enforce the usage of at_most_once
for entry-level phones.
Change-Id: Id40416672c4c2209a9866997d8c164b5de5dc7dc
Signed-off-by: Patrik Torstensson <totte@google.com>
Bug:
72664474
Rik van Riel [Thu, 1 Sep 2011 19:26:50 +0000 (15:26 -0400)]
ANDROID: add extra free kbytes tunable
Add a userspace visible knob to tell the VM to keep an extra amount
of memory free, by increasing the gap between each zone's min and
low watermarks.
This is useful for realtime applications that call system
calls and have a bound on the number of allocations that happen
in any short time period. In this application, extra_free_kbytes
would be left at an amount equal to or larger than than the
maximum number of allocations that happen in any burst.
It may also be useful to reduce the memory use of virtual
machines (temporarily?), in a way that does not cause memory
fragmentation like ballooning does.
[ccross]
Revived for use on old kernels where no other solution exists.
The tunable will be removed on kernels that do better at avoiding
direct reclaim.
[surenb]
Will be reverted as soon as Android framework is reworked to
use upstream-supported watermark_scale_factor instead of
extra_free_kbytes.
Bug:
86445363
Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e
Signed-off-by: Rik van Riel<riel@redhat.com>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Alistair Strachan [Thu, 31 May 2018 22:47:36 +0000 (15:47 -0700)]
ANDROID: x86_64_cuttlefish_defconfig: Enable F2FS
Bug:
80475502
Change-Id: I061467404f1d4b828ac1b7423db375a35934ce28
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Thu, 31 May 2018 20:36:29 +0000 (13:36 -0700)]
ANDROID: Update x86_64_cuttlefish_defconfig
Merge with the configs from kernel/configs.git added recently.
This should fix ipsec VPN functionality.
Bug:
80540078
Change-Id: I9cc99f5e34d2809670fe2fc0df121610657f6769
Signed-off-by: Alistair Strachan <astrachan@google.com>
Connor O'Brien [Wed, 23 May 2018 20:00:23 +0000 (13:00 -0700)]
ANDROID: proc: fix undefined behavior in proc_uid_base_readdir
When uid_base_stuff has no entries, proc_uid_base_readdir tries to
compute an address before the start of the array. Revise this check to
use uid_base_stuff + nents instead, which makes the code valid
regardless of array size.
Bug:
80158484
Test: No more compiler warning with CONFIG_CPU_FREQ_TIMES=n
Change-Id: I6e55b27c3ba8210cee194f6d27bbd62c0b263796
Signed-off-by: Connor O'Brien <connoro@google.com>
Alistair Strachan [Wed, 23 May 2018 00:38:34 +0000 (17:38 -0700)]
x86: vdso: Fix leaky vdso linker with CC=clang.
The vdso{32,64}.so can fail to build when CC=clang when clang tries to
find a suitable GCC toolchain to link these libraries with.
/usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782)
This happens because the host environment leaked into the CROSS_COMPILE
environment due to the way clang searches for suitable GCC toolchains.
Most of the time this goes unnoticed because the host linker is new
enough to work anyway, but on this particular machine it was not.
Extract the needed --target and --gcc-toolchain flags added in the top
level Makefile from KBUILD_CFLAGS.
Bug:
63889157
Change-Id: If7d4097d1d2eaf95f18d0295483bde8792a06844
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Tue, 22 May 2018 23:35:04 +0000 (16:35 -0700)]
ANDROID: x86_64_cuttlefish_defconfig: Disable ORC unwinder.
Disable the ORC unwinder. This feature requires libelf-dev, which is
breaking some automated build systems that do not have it installed.
As we already enabled CONFIG_FRAME_POINTER, we already incurred the
performance penalty of the legacy stack unwinder, so this is pretty
much a no-op change.
Bug:
63889157
Change-Id: Ic0704ebb726c97449ed873556262cc0db3e9a6cf
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Wed, 23 May 2018 01:01:41 +0000 (18:01 -0700)]
ANDROID: build: cuttlefish: Upgrade clang to newer version.
The last upgrade introduced a new build failure, because it had a bug
which caused it to emit PLT relocations, certain types of which cannot
be handled by the reloc tool in the kernel.
See https://bugs.llvm.org/show_bug.cgi?id=36674 for more details.
Bug:
63889157
Change-Id: I813febdbacb0579abcb12dc7f2164cce1e2f5a26
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Tue, 22 May 2018 23:09:23 +0000 (16:09 -0700)]
ANDROID: build: cuttlefish: Upgrade clang to newer version.
Use the same clang version as hikey-linaro.
Bug:
63889157
Change-Id: I6932d6149642d429086207e63aa8a8d5c2afd6f7
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Tue, 22 May 2018 18:41:31 +0000 (11:41 -0700)]
ANDROID: build: cuttlefish: Fix path to clang.
Reconcile with changes made to the kernel manifest. Clang must come from
master because it was not usable for kernel builds in older branches of
the Android platform.
Bug:
63889157
Change-Id: Id0a080fc2f1cba495f37f26afa48e43e736b756a
Signed-off-by: Alistair Strachan <astrachan@google.com>
Daniel Rosenberg [Wed, 25 Apr 2018 01:06:56 +0000 (18:06 -0700)]
ANDROID: sdcardfs: Don't d_drop in d_revalidate
After d_revalidate returns 0, the vfs will call
d_invalidate, which will call d_drop itself, along
with other cleanup.
Bug:
78262592
Change-Id: Idbb30e008c05d62edf2217679cb6a5517d8d1a2c
Signed-off-by: Daniel Rosenberg <drosen@google.com>
David Herrmann [Sat, 24 Oct 2015 23:20:18 +0000 (16:20 -0700)]
ANDROID: goldfish: drop CONFIG_INPUT_KEYCHORD
Remove keychord driver, replaced in user space by
https://android-review.googlesource.com/c/677629.
Signed-off-by: Mark Salyzyn <salyzyn@google.com>
Cc: Jin Qian <jinqian@google.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Bug:
64114943
Change-Id: I0b673a5c68dbe28afa033d2ca70e12daea144b2a
Wei Wang [Fri, 4 May 2018 05:16:10 +0000 (22:16 -0700)]
ANDROID: build.config: enforce trace_printk check
Bug:
79166848
Change-Id: I41d2fe57b377e305b4b68c30c98ee94643d142e4
Test: Build a kernel with trace_prink and see warning
Signed-off-by: Wei Wang <wvw@google.com>
Daniel Rosenberg [Thu, 12 Apr 2018 01:39:43 +0000 (18:39 -0700)]
ANDROID: sdcardfs: Set s_root to NULL after putting
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
77923821
Change-Id: I1705bfd146009561d2d1da5f0e6a342ec6932a1c
Daniel Rosenberg [Wed, 11 Apr 2018 23:24:51 +0000 (16:24 -0700)]
ANDROID: sdcardfs: d_make_root calls iput
d_make_root will call iput on failure, so we
shouldn't try to do that ourselves.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
77923821
Change-Id: I1abb4afb0f894ab917b7c6be8c833676f436beb7
Daniel Rosenberg [Wed, 11 Apr 2018 23:19:10 +0000 (16:19 -0700)]
ANDROID: sdcardfs: Check for private data earlier
When an sdcardfs dentry is destroyed, it may not yet
have its fsdata initialized. It must be checked before
we try to access the paths in its private data.
Additionally, when cleaning up the superblock after
a failure, we don't have our sb private data, so
check for that case.
Bug:
77923821
Change-Id: I89caf6e121ed86480b42024664453fe0031bbcf3
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Sami Tolvanen [Mon, 13 Nov 2017 16:19:17 +0000 (08:19 -0800)]
FROMLIST: arm64: kvm: use -fno-jump-tables with clang
Starting with LLVM r308050, clang generates a jump table with EL1
virtual addresses in __init_stage2_translation, which results in a
kernel panic when booting at EL2:
Kernel panic - not syncing: HYP panic:
PS:
800003c9 PC:
ffff0000089e6fd8 ESR:
86000004
FAR:
ffff0000089e6fd8 HPFAR:
0000000009825000 PAR:
0000000000000000
VCPU:
000804fc20001221
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc7-dirty #3
Hardware name: ARM Juno development board (r1) (DT)
Call trace:
[<
ffff000008088ea4>] dump_backtrace+0x0/0x34c
[<
ffff000008089208>] show_stack+0x18/0x20
[<
ffff0000089c73ec>] dump_stack+0xc4/0xfc
[<
ffff0000080c8e1c>] panic+0x138/0x2b4
[<
ffff0000080c8ce4>] panic+0x0/0x2b4
SMP: stopping secondary CPUs
SMP: failed to stop secondary CPUs 0-3,5
Kernel Offset: disabled
CPU features: 0x002086
Memory Limit: none
---[ end Kernel panic - not syncing: HYP panic:
PS:
800003c9 PC:
ffff0000089e6fd8 ESR:
86000004
FAR:
ffff0000089e6fd8 HPFAR:
0000000009825000 PAR:
0000000000000000
VCPU:
000804fc20001221
This change adds -fno-jump-tables to arm64/hyp to work around the
bug.
Bug:
62093296
Bug:
67506682
Change-Id: I1257be1febdcbfcc886fe6183c698b7a98d2a153
(am from https://patchwork.kernel.org/patch/
10060301/)
Suggested-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Alistair Strachan [Fri, 13 Apr 2018 22:36:37 +0000 (15:36 -0700)]
ANDROID: Add build server config for cuttlefish.
The build server config can be used with gcc or clang.
Specify CC=clang to build with clang.
Change-Id: Id346ab1489ecaaef8e9e66b084cc416dd0581f69
Signed-off-by: Alistair Strachan <astrachan@google.com>
Alistair Strachan [Fri, 6 Apr 2018 01:01:04 +0000 (18:01 -0700)]
ANDROID: Add defconfig for cuttlefish.
This file is based on x86_64_defconfig, merged with the base and
recommended configs from configs.git, with the virtio drivers enabled
and some spurious kernel features turned off.
Change-Id: I61bde941e8cfef2dd83cb4ff040f7380922cc44e
Signed-off-by: Alistair Strachan <astrachan@google.com>
Connor O'Brien [Tue, 23 Jan 2018 02:28:08 +0000 (18:28 -0800)]
ANDROID: cpufreq: Add time_in_state to /proc/uid directories
Add per-uid files that report the data in binary format rather than
text, to allow faster reading & parsing by userspace.
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug:
72339335
Test: compare values to those reported in /proc/uid_time_in_state
Change-Id: I463039ea7f17b842be4c70024fe772539fe2ce02
Connor O'Brien [Mon, 16 Oct 2017 17:30:24 +0000 (10:30 -0700)]
ANDROID: proc: Add /proc/uid directory
Add support for reporting per-uid information through procfs, roughly
following the approach used for per-tid and per-tgid directories in
fs/proc/base.c.
This also entails some new tracking of which uids have been used, to
avoid losing information when the last task with a given uid exits.
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug:
72339335
Test: ls /proc/uid/; compare with UIDs in /proc/uid_time_in_state
Change-Id: I0908f0c04438b11ceb673d860e58441bf503d478
Connor O'Brien [Tue, 6 Feb 2018 21:30:27 +0000 (13:30 -0800)]
ANDROID: cpufreq: times: track per-uid time in state
Add /proc/uid_time_in_state showing per uid/frequency/cluster
times. Allow uid removal through /proc/uid_cputime/remove_uid_range.
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug:
72339335
Test: Read /proc/uid_time_in_state
Change-Id: I20ba3546a27c25b7e7991e2a86986e158aafa58c
Connor O'Brien [Thu, 1 Feb 2018 02:11:57 +0000 (18:11 -0800)]
ANDROID: cpufreq: track per-task time in state
Add time in state data to task structs, and create
/proc/<pid>/time_in_state files to show how long each individual task
has run at each frequency.
Create a CONFIG_CPU_FREQ_TIMES option to enable/disable this tracking.
Signed-off-by: Connor O'Brien <connoro@google.com>
Bug:
72339335
Test: Read /proc/<pid>/time_in_state
Change-Id: Ia6456754f4cb1e83b2bc35efa8fbe9f8696febc8
Ritesh Harjani [Mon, 19 Mar 2018 10:33:09 +0000 (16:03 +0530)]
ANDROID: fuse: Add null terminator to path in canonical path to avoid issue
page allocated in fuse_dentry_canonical_path to be handled in
fuse_dev_do_write is allocated using __get_free_pages(GFP_KERNEL).
This may not return a page with data filled with 0. Now this
page may not have a null terminator at all.
If this happens and userspace fuse daemon screws up by passing a string
to kernel which is not NULL terminated (or did not fill anything),
then inside fuse driver in kernel when we try to do
strlen(fuse_dev_write->kern_path->getname_kernel)
on that page data -> it may give us issue with kernel paging request.
Unable to handle kernel paging request at virtual address
------------[ cut here ]------------
<..>
PC is at strlen+0x10/0x90
LR is at getname_kernel+0x2c/0xf4
<..>
strlen+0x10/0x90
kern_path+0x28/0x4c
fuse_dev_do_write+0x5b8/0x694
fuse_dev_write+0x74/0x94
do_iter_readv_writev+0x80/0xb8
do_readv_writev+0xec/0x1cc
vfs_writev+0x54/0x64
SyS_writev+0x64/0xe4
el0_svc_naked+0x24/0x28
To avoid this we should ensure in case of FUSE_CANONICAL_PATH,
the page is null terminated.
Change-Id: I33ca7cc76b4472eaa982c67bb20685df451121f5
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Bug:
75984715
[Daniel - small edit, using args size ]
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Ritesh Harjani [Mon, 19 Mar 2018 10:19:54 +0000 (15:49 +0530)]
ANDROID: sdcardfs: Fix sdcardfs to stop creating cases-sensitive duplicate entries.
sdcardfs_name_match gets a 'name' argument from the underlying FS.
This need not be null terminated string.
So in sdcardfs_name_match -> qstr_case_eq -> we should use
str_n_case_eq.
This happens because few of the entries in lower level FS may not be
NULL terminated and may have some garbage characters passed while
doing sdcardfs_name_match.
For e.g.
# dmesg |grep Download
[ 103.646386] sdcardfs_name_match: q1->name=.nomedia, q1->len=8,
q2->name=Download\x17\x80\x03, q2->len=8
[ 104.021340] sdcardfs_name_match: q1->name=.nomedia, q1->len=8,
q2->name=Download\x17\x80\x03, q2->len=8
[ 105.196864] sdcardfs_name_match: q1->name=.nomedia, q1->len=8,
q2->name=Download\x17\x80\x03, q2->len=8
[ 109.113521] sdcardfs_name_match: q1->name=logs, q1->len=4,
q2->name=Download\x17\x80\x03, q2->len=8
Now when we try to create a directory with different case for a such
files. SDCARDFS creates a entry if it could not find the underlying
entry in it's dcache.
To reproduce:-
1. bootup the device wait for some time after sdcardfs mounting to
complete.
2. cd /storage/emulated/0
3. echo 3 > /proc/sys/vm/drop_caches
4. mkdir download
We now start seeing two entries with name.
Download & download.
Change-Id: I976d92a220a607dd8cdb96c01c2041c5c2bc3326
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
bug:
75987238
Amit Pundir [Mon, 26 Mar 2018 15:13:33 +0000 (20:43 +0530)]
ANDROID: arm64: Image.gz-dtb build target depends on Image.gz
While doing parallel builds using "make -j" option, I ran into
a build race condition a few times where-in Image.gz-dtb target
starts building before Image.gz is even ready, resulting in a
corrupt Image.gz-dtb kernel image.
How to reproduce -->
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-androidkernel- defconfig menuconfig
+
CONFIG_BUILD_ARM64_APPENDED_DTB_IMAGE=y
CONFIG_BUILD_ARM64_APPENDED_DTB_IMAGE_NAMES="qcom/apq8096-db820c"
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-androidkernel- -j9
<snip> ..
SYSMAP System.map
OBJCOPY arch/arm64/boot/Image
GZIP arch/arm64/boot/Image.gz
DTC arch/arm64/boot/dts/qcom/apq8096-db820c.dtb
Building modules, stage 2.
CAT arch/arm64/boot/Image.gz-dtb
GZIP arch/arm64/boot/Image.gz
.. <snip>
$ du -sh arch/arm64/boot/Image.gz-dtb
28K arch/arm64/boot/Image.gz-dtb
When built with this patch -->
$ du -sh arch/arm64/boot/Image.gz-dtb
8.9M arch/arm64/boot/Image.gz-dtb
Let's make Image.gz-dtb build target depend on Image.gz explicitly.
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Daniel Rosenberg [Fri, 16 Mar 2018 02:37:16 +0000 (19:37 -0700)]
ANDROID: sdcardfs: fix lock issue on 32 bit/SMP architectures
Fixes:
cc668ff4b6a1 ("ANDROID: sdcardfs: Hold i_mutex for i_size_write")
Change-Id: If7f2ed90f59c552b9ef9262b0f6aaed394f68784
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
73287721
Dmitry Shmidt [Wed, 7 Mar 2018 22:11:03 +0000 (14:11 -0800)]
ANDROID: uid_sys_stats: Copy task_struct comm field to bigger buffer
get_task_comm() currently checks if buf_size != TASK_COMM_LEN
and fails even if sizeof(buf) > TASK_COMM_LEN.
Change-Id: Icb3e9c172607534ef1db10baf5d626083db73498
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
Ritesh Harjani [Mon, 4 Dec 2017 04:21:07 +0000 (09:51 +0530)]
ANDROID: sdcardfs: Set num in extension_details during make_item
Without this patch when you delete an extension from configfs
it still exists in the hash table data structures and we are
unable to delete it or change it's group.
This happens because during deletion the key & value is taken from
extension_details, and was not properly set.
Fix it by this patch.
Change-Id: I7c20cb1ab4d99e6aceadcb5ef850f0bb47f18be8
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
73055997
Daniel Rosenberg [Wed, 21 Feb 2018 04:25:45 +0000 (20:25 -0800)]
ANDROID: sdcardfs: Hold i_mutex for i_size_write
When we call i_size_write, we must be holding i_mutex to avoid
possible lockups on 32 bit/SMP architectures. This is not
necessary on 64 bit architectures.
Change-Id: Ic3b946507c54d81b5c9046f9b57d25d4b0f9feef
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
73287721
Daniel Rosenberg [Fri, 2 Feb 2018 00:52:22 +0000 (16:52 -0800)]
ANDROID: sdcardfs: Protect set_top
If the top is changed while we're attempting to use it, it's
possible that the reference will be put while we are in the
process of grabbing a reference.
Now we grab a spinlock to protect grabbing our reference count.
Additionally, we now set the inode_info's top value to point to
it's own data when initializing, which makes tracking changes
easier.
Change-Id: If15748c786ce4c0480ab8c5051a92523aff284d2
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Daniel Rosenberg [Tue, 23 Jan 2018 23:02:50 +0000 (15:02 -0800)]
ANDROID: fsnotify: Notify lower fs of open
If the filesystem being watched supports d_canonical_path,
notify the lower filesystem of the open as well.
Change-Id: I2b1739e068afbaf5eb39950516072bff8345ebfe
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
70706497
Daniel Rosenberg [Tue, 30 Jan 2018 05:31:21 +0000 (21:31 -0800)]
ANDROID: sdcardfs: Use lower getattr times/size
We now use the lower filesystem's getattr for time and size related
information.
Change-Id: I3dd05614a0c2837a13eeb033444fbdf070ddce2a
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
72007585
Daniel Rosenberg [Tue, 30 Jan 2018 22:24:02 +0000 (14:24 -0800)]
ANDROID: Revert "fs: unexport vfs_read and vfs_write"
This reverts commit
bd8df82be66698042d11e7919e244c8d72b042ca.
These functions are used in sdcardfs
Change-Id: If740718cca903c211d1c2832c7fd95a4c4cfe20f
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Daniel Rosenberg [Thu, 25 Jan 2018 01:25:05 +0000 (17:25 -0800)]
ANDROID: sdcardfs: port to 4.14
Change-Id: I03271d8e8229ce6f22f337dc7d1938e0bf060f2a
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
70278506
Guenter Roeck [Mon, 30 Jan 2017 20:29:00 +0000 (12:29 -0800)]
ANDROID: fs: Export vfs_rmdir2
allmodconfig builds fail with
ERROR: "vfs_rmdir2" undefined!
Export the missing function.
Change-Id: I983d327e59fd34e0484f3c54d925e97d3905c19c
Fixes:
f9cb61dcb00c ("ANDROID: sdcardfs: User new permission2 functions")
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Guenter Roeck [Thu, 24 Mar 2016 17:39:14 +0000 (10:39 -0700)]
ANDROID: mm: Export do_munmap
The 0-day build bot reports the following build error, seen if SDCARD_FS
is built as module.
ERROR: "do_munmap" undefined!
Fixes:
84a1b7d3d312 ("Included sdcardfs source code for kernel 3.0")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Guenter Roeck [Thu, 24 Mar 2016 17:32:35 +0000 (10:32 -0700)]
ANDROID: fs: Export d_absolute_path
The 0-day build bot reports the following build error, seen if SDCARD_FS
is built as module.
ERROR: "d_absolute_path" undefined!
Fixes:
84a1b7d3d312 ("Included sdcardfs source code for kernel 3.0")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Guenter Roeck [Mon, 30 Jan 2017 20:26:08 +0000 (12:26 -0800)]
ANDROID: fs: Export free_fs_struct and set_fs_pwd
allmodconfig builds fail with:
ERROR: "free_fs_struct" undefined!
ERROR: "set_fs_pwd" undefined!
Export the missing symbols.
Change-Id: I4877ead19d7e7f0c93d4c4cad5681364284323aa
Fixes:
0ec03f845799 ("ANDROID: sdcardfs: override umask on mkdir and create")
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Daniel Rosenberg [Fri, 10 Feb 2017 03:38:57 +0000 (19:38 -0800)]
ANDROID: export security_path_chown
Signed-off-by: Daniel Rosenberg <drosen@google.com>
BUG:
35142419
Change-Id: I05a9430a3c1bc624e019055175ad377290b4e774
Daniel Rosenberg [Tue, 2 Jan 2018 22:44:49 +0000 (14:44 -0800)]
ANDROID: sdcardfs: Add default_normal option
The default_normal option causes mounts with the gid set to
AID_SDCARD_RW to have user specific gids, as in the normal case.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I9619b8ac55f41415df943484dc8db1ea986cef6f
Bug:
64672411
Daniel Rosenberg [Thu, 20 Jul 2017 00:25:07 +0000 (17:25 -0700)]
ANDROID: Sdcardfs: Move gid derivation under flag
This moves the code to adjust the gid/uid of lower filesystem
files under the mount flag derive_gid.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I44eaad4ef67c7fcfda3b6ea3502afab94442610c
Bug:
63245673
Jaegeuk Kim [Fri, 7 Jul 2017 02:12:22 +0000 (19:12 -0700)]
ANDROID: sdcardfs: override credential for ioctl to lower fs
Otherwise, lower_fs->ioctl() fails due to inode_owner_or_capable().
Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
Bug:
63260873
Change-Id: I623a6c7c5f8a3cbd7ec73ef89e18ddb093c43805
Daniel Rosenberg [Thu, 20 Jul 2017 00:16:43 +0000 (17:16 -0700)]
ANDROID: sdcardfs: Remove unnecessary lock
The mmap_sem lock does not appear to be protecting
anything, and has been removed in Samsung's more
recent versions of sdcardfs.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I76ff3e33002716b8384fc8be368028ed63dffe4e
Bug:
63785372
Gao Xiang [Tue, 9 May 2017 04:30:56 +0000 (12:30 +0800)]
ANDROID: sdcardfs: use mount_nodev and fix a issue in sdcardfs_kill_sb
Use the VFS mount_nodev instead of customized mount_nodev_with_options
and fix generic_shutdown_super to kill_anon_super because of set_anon_super
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Change-Id: Ibe46647aa2ce49d79291aa9d0295e9625cfccd80
Greg Hackmann [Tue, 16 May 2017 23:48:49 +0000 (16:48 -0700)]
ANDROID: sdcardfs: remove dead function open_flags_to_access_mode()
smatch warns about the suspicious formatting in the last line of
open_flags_to_access_mode(). It turns out the only caller was deleted
over a year ago by "ANDROID: sdcardfs: Bring up to date with Android M
permissions:", so we can "fix" the function's formatting by deleting it.
Change-Id: Id85946f3eb01722eef35b1815f405a6fda3aa4ff
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Daniel Rosenberg [Wed, 7 Jun 2017 19:44:50 +0000 (12:44 -0700)]
ANDROID: sdcardfs: d_splice_alias can return error values
We must check that d_splice_alias was successful before using its
output.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
62390017
Change-Id: Ifda0a052fb3f67e35c635a4e5e907876c5400978
Daniel Rosenberg [Mon, 22 May 2017 20:23:56 +0000 (13:23 -0700)]
ANDROID: sdcardfs: Check for NULL in revalidate
If the inode is in the process of being evicted,
the top value may be NULL.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
38502532
Change-Id: I0b9d04aab621e0398d44d1c5dc53293106aa5f89
Daniel Rosenberg [Mon, 15 May 2017 21:03:15 +0000 (14:03 -0700)]
ANDROID: sdcardfs: Move top to its own struct
Move top, and the associated data, to its own struct.
This way, we can properly track refcounts on top
without interfering with the inode's accounting.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
38045152
Change-Id: I1968e480d966c3f234800b72e43670ca11e1d3fd
Gao Xiang [Wed, 10 May 2017 15:01:15 +0000 (23:01 +0800)]
ANDROID: sdcardfs: fix sdcardfs_destroy_inode for the inode RCU approach
According to the following commits,
fs: icache RCU free inodes
vfs: fix the stupidity with i_dentry in inode destructors
sdcardfs_destroy_inode should be fixed for the fast path safety.
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Change-Id: I84f43c599209d23737c7e28b499dd121cb43636d
Daniel Roseberg [Tue, 9 May 2017 20:36:35 +0000 (13:36 -0700)]
ANDROID: sdcardfs: Don't iput if we didn't igrab
If we fail to get top, top is either NULL, or igrab found
that we're in the process of freeing that inode, and did
not grab it. Either way, we didn't grab it, and have no
business putting it.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
38117720
Change-Id: Ie2f587483b9abb5144263156a443e89bc69b767b
Daniel Rosenberg [Tue, 25 Apr 2017 02:49:02 +0000 (19:49 -0700)]
ANDROID: sdcardfs: Call lower fs's revalidate
We should be calling the lower filesystem's revalidate
inside of sdcardfs's revalidate, as wrapfs does.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
35766959
Change-Id: I939d1c4192fafc1e21678aeab43fe3d588b8e2f4
Daniel Rosenberg [Mon, 24 Apr 2017 23:11:03 +0000 (16:11 -0700)]
ANDROID: sdcardfs: Avoid setting GIDs outside of valid ranges
When setting up the ownership of files on the lower filesystem,
ensure that these values are in reasonable ranges for apps. If
they aren't, default to AID_MEDIA_RW
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
37516160
Change-Id: I0bec76a61ac72aff0b993ab1ad04be8382178a00
Daniel Rosenberg [Mon, 24 Apr 2017 23:10:21 +0000 (16:10 -0700)]
ANDROID: sdcardfs: Copy meta-data from lower inode
From wrapfs commit
3ee9b365e38c ("Wrapfs: properly copy meta-data after
AIO operations from lower inode")
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
35766959
Change-Id: I9a789222e27a17b8d85ce61c45397d1839f9a675
Daniel Rosenberg [Fri, 21 Apr 2017 01:05:02 +0000 (18:05 -0700)]
ANDROID: sdcardfs: Use filesystem specific hash
We weren't accounting for FS specific hash functions,
causing us to miss negative dentries for any FS that
had one.
Similar to a patch from esdfs
commit
75bd25a9476d ("esdfs: support lower's own hash")
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I32d1ba304d728e0ca2648cacfb4c2e441ae63608
Daniel Rosenberg [Wed, 19 Apr 2017 05:49:38 +0000 (22:49 -0700)]
ANDROID: sdcardfs: Don't complain in fixup_lower_ownership
Not all filesystems support changing the owner of a file.
We shouldn't complain if it doesn't happen.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
37488099
Change-Id: I403e44ab7230f176e6df82f6adb4e5c82ce57f33