:mod:`audioop` --- Manipulate raw audio data
============================================

.. module:: audioop
   :synopsis: Manipulate raw audio data.


The :mod:`audioop` module contains some useful operations on sound fragments.
It operates on sound fragments consisting of signed integer samples 8, 16 or 32
bits wide, stored in Python strings. This is the same format as used by the
:mod:`al` and :mod:`sunaudiodev` modules. All scalar items are integers, unless
specified otherwise.

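As an illustration of the fragment layout described above, the following
sketch packs and unpacks signed samples with the :mod:`struct` module. The
helper names ``pack_samples`` and ``unpack_samples`` are hypothetical and are
not part of :mod:`audioop`, and native byte order is an assumption made here.

```python
import struct

# Assumption: struct codes for signed 8-, 16- and 32-bit integers.
_CODES = {1: "b", 2: "h", 4: "i"}


def pack_samples(samples, width):
    """Pack a sequence of signed integers into a raw fragment."""
    # "=" selects native byte order with standard sizes (1, 2 and 4 bytes).
    return struct.pack("=%d%s" % (len(samples), _CODES[width]), *samples)


def unpack_samples(fragment, width):
    """Unpack a raw fragment into a list of signed integers."""
    count = len(fragment) // width
    return list(struct.unpack("=%d%s" % (count, _CODES[width]), fragment))
```

A fragment of three 2-byte samples is therefore six bytes long, which is why
every operation needs to be told the sample width.
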
.. index::
   single: Intel/DVI ADPCM
   single: ADPCM, Intel/DVI
   single: a-LAW
   single: u-LAW

This module provides support for a-LAW, u-LAW and Intel/DVI ADPCM encodings.

.. This para is mostly here to provide an excuse for the index entries...

A few of the more complicated operations only take 16-bit samples; otherwise the
sample size (in bytes) is always a parameter of the operation.

The module defines the following variables and functions:


.. exception:: error

   This exception is raised on all errors, such as unknown number of bytes per
   sample, etc.


.. function:: add(fragment1, fragment2, width)

   Return a fragment which is the sum of the two fragments passed as parameters.
   *width* is the sample width in bytes, either ``1``, ``2`` or ``4``. Both
   fragments should have the same length.


.. function:: adpcm2lin(adpcmfragment, width, state)

   Decode an Intel/DVI ADPCM coded fragment to a linear fragment. See the
   description of :func:`lin2adpcm` for details on ADPCM coding. Return a tuple
   ``(sample, newstate)`` where the sample has the width specified in *width*.


.. function:: alaw2lin(fragment, width)

   Convert sound fragments in a-LAW encoding to linearly encoded sound fragments.
   a-LAW encoding always uses 8-bit samples, so *width* refers only to the sample
   width of the output fragment here.

   .. versionadded:: 2.5


.. function:: avg(fragment, width)

   Return the average over all samples in the fragment.


.. function:: avgpp(fragment, width)

   Return the average peak-peak value over all samples in the fragment. No
   filtering is done, so the usefulness of this routine is questionable.


.. function:: bias(fragment, width, bias)

   Return a fragment that is the original fragment with a bias added to each
   sample.


.. function:: cross(fragment, width)

   Return the number of zero crossings in the fragment passed as an argument.


.. function:: findfactor(fragment, reference)

   Return a factor *F* such that ``rms(add(fragment, mul(reference, -F)))`` is
   minimal, i.e., return the factor with which you should multiply *reference* to
   make it match as well as possible to *fragment*. The fragments should both
   contain 2-byte samples.

   The time taken by this routine is proportional to ``len(fragment)``.


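Minimizing the RMS of ``fragment - F * reference`` is a least-squares problem:
setting the derivative of ``sum((f_i - F*r_i)**2)`` to zero gives
``F = sum(f_i*r_i) / sum(r_i**2)``. The sketch below evaluates that closed form
in pure Python. The function name ``least_squares_factor`` is hypothetical,
native byte order is assumed, and this is not presented as the module's own
implementation.

```python
import struct


def least_squares_factor(fragment, reference):
    """Return F minimizing sum((f_i - F * r_i) ** 2) for 2-byte samples."""
    if len(fragment) != len(reference):
        raise ValueError("fragments must have the same length")
    count = len(reference) // 2
    # Assumption: signed 16-bit samples in native byte order.
    f = struct.unpack("=%dh" % count, fragment[:2 * count])
    r = struct.unpack("=%dh" % count, reference[:2 * count])
    numerator = sum(a * b for a, b in zip(f, r))
    denominator = sum(b * b for b in r)
    if denominator == 0:
        raise ValueError("reference contains only zero samples")
    return float(numerator) / denominator
```

A single pass over the samples is enough, which is consistent with the running
time being proportional to ``len(fragment)``.
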
.. function:: findfit(fragment, reference)

   Try to match *reference* as well as possible to a portion of *fragment* (which
   should be the longer fragment). This is (conceptually) done by taking slices
   out of *fragment*, using :func:`findfactor` to compute the best match, and
   minimizing the result. The fragments should both contain 2-byte samples.
   Return a tuple ``(offset, factor)`` where *offset* is the (integer) offset into
   *fragment* where the optimal match started and *factor* is the (floating-point)
   factor as per :func:`findfactor`.


.. function:: findmax(fragment, length)

   Search *fragment* for a slice of length *length* samples (not bytes!) with
   maximum energy, i.e., return *i* for which ``rms(fragment[i*2:(i+length)*2])``
   is maximal. The fragment should contain 2-byte samples.

   The routine takes time proportional to ``len(fragment)``.


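For a fixed window length, maximizing the RMS is the same as maximizing the sum
of squared samples, so the search can be done with a sliding window that is
updated in constant time per step. The following pure-Python sketch shows the
idea; the name ``max_energy_offset`` is hypothetical, native byte order is
assumed, and ties are resolved in favour of the earliest offset as a choice made
here.

```python
import struct


def max_energy_offset(fragment, length):
    """Return the sample offset of the window of `length` samples
    whose sum of squares is largest (2-byte samples)."""
    count = len(fragment) // 2
    if not 0 < length <= count:
        raise ValueError("length must be between 1 and the sample count")
    # Assumption: signed 16-bit samples in native byte order.
    samples = struct.unpack("=%dh" % count, fragment[:2 * count])
    energy = sum(s * s for s in samples[:length])
    best_offset, best_energy = 0, energy
    for i in range(1, count - length + 1):
        # Slide the window: add the entering sample, drop the leaving one.
        energy += samples[i + length - 1] ** 2 - samples[i - 1] ** 2
        if energy > best_energy:
            best_offset, best_energy = i, energy
    return best_offset
```
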
.. function:: getsample(fragment, width, index)

   Return the value of sample *index* from the fragment.


.. function:: lin2adpcm(fragment, width, state)

   Convert samples to 4-bit Intel/DVI ADPCM encoding. ADPCM coding is an adaptive
   coding scheme, whereby each 4-bit number is the difference between one sample
   and the next, divided by a (varying) step. The Intel/DVI ADPCM algorithm has
   been selected for use by the IMA, so it may well become a standard.

   *state* is a tuple containing the state of the coder. The coder returns a tuple
   ``(adpcmfrag, newstate)``, and the *newstate* should be passed to the next call
   of :func:`lin2adpcm`. In the initial call, ``None`` can be passed as the state.
   *adpcmfrag* is the ADPCM coded fragment packed two 4-bit values per byte.


.. function:: lin2alaw(fragment, width)

   Convert samples in the audio fragment to a-LAW encoding and return this as a
   Python string. a-LAW is an audio encoding format whereby you get a dynamic
   range of about 13 bits using only 8-bit samples. It is used by the Sun audio
   hardware, among others.

   .. versionadded:: 2.5


.. function:: lin2lin(fragment, width, newwidth)

   Convert samples between 1-, 2- and 4-byte formats.

   .. note::

      In some audio formats, such as .WAV files, 16 and 32 bit samples are
      signed, but 8 bit samples are unsigned. So when converting to 8 bit wide
      samples for these formats, you need to also add 128 to the result::

         new_frames = audioop.lin2lin(frames, old_width, 1)
         new_frames = audioop.bias(new_frames, 1, 128)

      The same, in reverse, has to be applied when converting from 8 to 16 or 32
      bit width samples.


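To make the effect of the 128 offset concrete, the sketch below performs the
signed-to-unsigned step on 1-byte samples in pure Python, without
:mod:`audioop`. The function name ``signed8_to_unsigned8`` is hypothetical and
the code only illustrates the arithmetic.

```python
import struct


def signed8_to_unsigned8(fragment):
    """Shift signed 8-bit samples (-128..127) to unsigned (0..255)."""
    count = len(fragment)
    signed = struct.unpack("%db" % count, fragment)
    # Adding 128 maps -128 to 0, 0 to 128 and 127 to 255.
    return struct.pack("%dB" % count, *[s + 128 for s in signed])
```

Silence (sample value 0) in the signed representation becomes 128 in the
unsigned one, which is the midpoint of the unsigned range.
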
.. function:: lin2ulaw(fragment, width)

   Convert samples in the audio fragment to u-LAW encoding and return this as a
   Python string. u-LAW is an audio encoding format whereby you get a dynamic
   range of about 14 bits using only 8-bit samples. It is used by the Sun audio
   hardware, among others.


.. function:: minmax(fragment, width)

   Return a tuple consisting of the minimum and maximum values of all samples in
   the sound fragment.


.. function:: max(fragment, width)

   Return the maximum of the *absolute value* of all samples in a fragment.


.. function:: maxpp(fragment, width)

   Return the maximum peak-peak value in the sound fragment.


.. function:: mul(fragment, width, factor)

   Return a fragment that has all samples in the original fragment multiplied by
   the floating-point value *factor*. Overflow is silently ignored.


.. function:: ratecv(fragment, width, nchannels, inrate, outrate, state[, weightA[, weightB]])

   Convert the frame rate of the input fragment.

   *state* is a tuple containing the state of the converter. The converter returns
   a tuple ``(newfragment, newstate)``, and *newstate* should be passed to the next
   call of :func:`ratecv`. The initial call should pass ``None`` as the state.

   The *weightA* and *weightB* arguments are parameters for a simple digital filter
   and default to ``1`` and ``0`` respectively.


.. function:: reverse(fragment, width)

   Reverse the samples in a fragment and return the modified fragment.


.. function:: rms(fragment, width)

   Return the root-mean-square of the fragment, i.e. ``sqrt(sum(S_i^2)/n)``.

   This is a measure of the power in an audio signal.


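The formula ``sqrt(sum(S_i^2)/n)`` can be written out directly in Python. The
sketch below works on a list of already-decoded integer samples rather than on
a raw fragment and returns a float; the name ``rms_of_samples`` is hypothetical,
and any rounding applied by the module itself is not reproduced here.

```python
import math


def rms_of_samples(samples):
    """Return sqrt(sum(S_i^2) / n) for a sequence of integer samples."""
    if not samples:
        # Convention chosen for this sketch: an empty input has zero power.
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / float(len(samples)))
```

Because each sample is squared, positive and negative excursions contribute
equally, so a signal alternating between ``5`` and ``-5`` has an RMS of ``5``
even though its average is zero.
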
.. function:: tomono(fragment, width, lfactor, rfactor)

   Convert a stereo fragment to a mono fragment. The left channel is multiplied by
   *lfactor* and the right channel by *rfactor* before adding the two channels to
   give a mono signal.


.. function:: tostereo(fragment, width, lfactor, rfactor)

   Generate a stereo fragment from a mono fragment. Each pair of samples in the
   stereo fragment is computed from the mono sample, whereby left channel samples
   are multiplied by *lfactor* and right channel samples by *rfactor*.


.. function:: ulaw2lin(fragment, width)

   Convert sound fragments in u-LAW encoding to linearly encoded sound fragments.
   u-LAW encoding always uses 8-bit samples, so *width* refers only to the sample
   width of the output fragment here.

Note that operations such as :func:`mul` or :func:`max` make no distinction
between mono and stereo fragments, i.e. all samples are treated equally. If this
is a problem, the stereo fragment should be split into two mono fragments first
and recombined later. Here is an example of how to do that::

   import audioop

   def mul_stereo(sample, width, lfactor, rfactor):
       # Split the stereo fragment into its left and right channels.
       lsample = audioop.tomono(sample, width, 1, 0)
       rsample = audioop.tomono(sample, width, 0, 1)
       # Scale each channel by its own factor.
       lsample = audioop.mul(lsample, width, lfactor)
       rsample = audioop.mul(rsample, width, rfactor)
       # Expand each channel back to stereo and mix the two together.
       lsample = audioop.tostereo(lsample, width, 1, 0)
       rsample = audioop.tostereo(rsample, width, 0, 1)
       return audioop.add(lsample, rsample, width)

If you use the ADPCM coder to build network packets and you want your protocol
to be stateless (i.e. to be able to tolerate packet loss) you should not only
transmit the data but also the state. Note that you should send the *initial*
state (the one you passed to :func:`lin2adpcm`) along to the decoder, not the
final state (as returned by the coder). If you want to use the :mod:`struct`
module to store the state in binary you can code the first element (the
predicted value) in 16 bits and the second (the delta index) in 8.

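The state layout described above (16 bits for the predicted value, 8 bits for
the delta index) can be sketched with :mod:`struct` as follows. The names
``STATE_FORMAT``, ``pack_state`` and ``unpack_state`` are hypothetical, the
little-endian wire order is an arbitrary choice for this example, and treating
``None`` as ``(0, 0)`` is an assumption about the coder's initial state.

```python
import struct

# Hypothetical wire format: signed 16-bit predicted value followed by an
# unsigned 8-bit delta index, little-endian, no padding (3 bytes total).
STATE_FORMAT = "<hB"


def pack_state(state):
    """Serialize an ADPCM coder state tuple for inclusion in a packet."""
    if state is None:
        # Assumption: the initial state behaves like (0, 0).
        state = (0, 0)
    predicted, index = state
    return struct.pack(STATE_FORMAT, predicted, index)


def unpack_state(data):
    """Recover the (predicted value, delta index) tuple from a packet."""
    return struct.unpack(STATE_FORMAT, data)
```

Each packet would then carry the three state bytes for the state that was
passed *into* the coder, followed by the ADPCM data produced from it.
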
The ADPCM coders have never been tried against other ADPCM coders, only against
themselves. It could well be that I misinterpreted the standards, in which case
they will not be interoperable with the respective standards.

The :func:`find\*` routines might look a bit funny at first sight. They are
primarily meant to do echo cancellation. A reasonably fast way to do this is to
pick the most energetic piece of the output sample, locate that in the input
sample and subtract the whole output sample from the input sample::

   def echocancel(outputdata, inputdata):
       pos = audioop.findmax(outputdata, 800)    # one tenth second
       out_test = outputdata[pos*2:]
       in_test = inputdata[pos*2:]
       ipos, factor = audioop.findfit(in_test, out_test)
       # Optional (for better cancellation):
       # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],
       #              out_test)
       prefill = '\0'*(pos+ipos)*2
       postfill = '\0'*(len(inputdata)-len(prefill)-len(outputdata))
       outputdata = prefill + audioop.mul(outputdata,2,-factor) + postfill
       return audioop.add(inputdata, outputdata, 2)