Below is an example of what this will look like:
The above was created using kaizen.place’s Audio to Waveform Video Reel Generator. This post walks through a simplified version of that tool to show how to convert audio data into a visual waveform.
<!-- index.html -->
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Audio to Waveform Video</title>
<script src="index.js" defer></script>
</head>
<body>
</body>
</html>
async function run() {
const audioBuffer = await loadAndDecodeAudio("sample.mp3");
// For simplicity, only using the first channel of data
const channelData = audioBuffer.getChannelData(0);
// This tracks the maximum average seen across all chunks
let max = 0;
// How many milliseconds a chunk represents
const msPerChunk = 100;
// How many data points will be included in each chunk
const chunkSize = Math.round((audioBuffer.sampleRate * msPerChunk) / 1000);
// To get the average we need to sum up all values in the chunk
let chunkTotalValue = 0;
// As we compute chunk averages store them in this array
let chunkAverages = [];
// This primarily helps cover the final case where a chunk has fewer values
// than the chunk size
let currentChunkSize = 0;
for (let i = 0; i < audioBuffer.length; i++) {
// Channel data will be between -1 and 1
// Absolute value ensures negatives don't just cancel out positives
const value = Math.abs(channelData[i]);
currentChunkSize++;
chunkTotalValue += value;
if (i > 0 && (i % chunkSize === 0 || i === audioBuffer.length - 1)) {
const chunkAverage = chunkTotalValue / currentChunkSize;
if (chunkAverage > max) {
max = chunkAverage;
}
chunkAverages.push(chunkAverage);
chunkTotalValue = 0;
currentChunkSize = 0;
}
}
// Use the max average we found to normalize the averages to be between 0 and 1
const normalizedChunkValues = chunkAverages.map((avg) => {
return avg / max;
});
// Create a canvas and add to the document to draw on
const canvas = document.createElement("canvas");
canvas.width = 720;
canvas.height = 1280;
document.body.appendChild(canvas);
const ctx = canvas.getContext("2d");
render({
canvas,
ctx,
normalizedChunkValues,
startTime: new Date().getTime(),
msPerChunk,
});
}
function render({ canvas, ctx, normalizedChunkValues, startTime, msPerChunk }) {
// The elapsedTime allows us to know how far into the audio we are
const elapsedTime = new Date().getTime() - startTime;
// Clear the entire canvas to remove any drawings from previous frame
ctx.fillStyle = "#000000";
ctx.fillRect(0, 0, canvas.width, canvas.height);
ctx.fillStyle = "#FFFFFF";
const barWidth = 4;
const barSpacing = 4;
const maxBarHeight = 200;
for (let i = 0; i < normalizedChunkValues.length; i++) {
// normalizedChunkValues will be a float 0-1 - a percentage of max amplitude
const value = normalizedChunkValues[i];
// The highest amplitude part of audio will get a bar at the max height
const barHeight = maxBarHeight * value;
// This moves the bars based on how much time has passed
const xOffset = (elapsedTime / msPerChunk) * (barWidth + barSpacing);
// Spaces out the bars
const x = i * (barWidth + barSpacing) - xOffset;
// Centers the bars to the middle of the canvas
const y = (canvas.height - barHeight) / 2;
// Draws the bar at the calculated position and size
ctx.fillRect(x, y, barWidth, barHeight);
}
// Calls this function again at the start of the next frame
// Typically this is 60fps, but will depend on the display rate of your monitor
requestAnimationFrame(render.bind(this, ...arguments));
}
// Helper function to fetch and decode from a URL
async function loadAndDecodeAudio(audioURL) {
const response = await fetch(audioURL);
const arrayBuffer = await response.arrayBuffer();
return decodeAudioData(arrayBuffer);
}
// Decodes the ArrayBuffer into an AudioBuffer
// This gives access to the raw channel data which we use to generate the waveform
// https://developer.mozilla.org/en-US/docs/Web/API/AudioBuffer
// https://developer.mozilla.org/en-US/docs/Web/API/BaseAudioContext/decodeAudioData
async function decodeAudioData(arrayBuffer) {
return new Promise((resolve, reject) => {
const audioContext = new (window.AudioContext ||
window.webkitAudioContext)();
audioContext.decodeAudioData(arrayBuffer, resolve, reject);
});
}
run();
You can use whatever audio file you like. If you don’t have one you can grab a sample one from here.
npm i -g http-server
http-server .
Head to http://localhost:8080 and you should see the canvas animating through the waveform generated from the audio file.
This creates a nice animation in the browser, but if you want to share it somewhere you will most likely want to download it as a video file (specifically an mp4). You can either follow How to Save HTML Canvas to Mp4 Using WebCodecs API 10x Faster Than Realtime in the Browser or How to Record HTML Canvas using MediaRecorder and Export as Video to take this canvas drawing and export a video file.
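If the in-browser recording route is enough for your needs, here is a minimal sketch of the MediaRecorder approach, assuming the canvas created above; the recordCanvas helper name and the 60 fps capture rate are illustrative, and the output will typically be webm rather than mp4 outside Safari:
// Sketch: record the animating canvas for a few seconds and download the result
function recordCanvas(canvas, durationMs = 5000) {
const stream = canvas.captureStream(60);
const recorder = new MediaRecorder(stream);
const chunks = [];
recorder.ondataavailable = (e) => chunks.push(e.data);
recorder.onstop = () => {
const blob = new Blob(chunks, { type: recorder.mimeType });
const a = document.createElement("a");
a.href = URL.createObjectURL(blob);
a.download = "waveform.webm";
a.click();
};
recorder.start();
setTimeout(() => recorder.stop(), durationMs);
}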
Alternatively, if it suits your needs you can use the kaizen.place Audio to Waveform Video Reel Generator.
In working on an Audio Waveform Reel Generator for kaizen.place, I found a few resources for recording a canvas using the MediaRecorder API [1][2], the method used for the initial implementation (I wrote about it here). Unfortunately, the only browser that seemed to record directly to mp4 was Safari. Chrome would record as a webm file, which Instagram doesn’t support.
We were able to bridge the gap by using ffmpeg.wasm to convert our webm file to an mp4. However, on top of a 60 second recording session, we now also had another step in the pipeline and a 30 MB download, slowing things down even more. With some careful codec selection, it was possible in some browsers to record webm with a codec that could simply be copied over to an mp4 container, which brought things to a somewhat acceptable level.
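For reference, a conversion along these lines can be done with ffmpeg.wasm. This is a rough sketch, assuming the @ffmpeg/ffmpeg 0.11-style API (createFFmpeg/FS/run) and a webm recorded with an mp4-compatible codec such as H.264, so that -c copy can move the stream into an mp4 container without re-encoding; the webmToMp4 helper name is just for illustration, not the exact code we used:
import { createFFmpeg, fetchFile } from "@ffmpeg/ffmpeg";
// Sketch: remux a recorded webm Blob into an mp4 container without re-encoding
async function webmToMp4(webmBlob) {
const ffmpeg = createFFmpeg({ log: true });
await ffmpeg.load();
ffmpeg.FS("writeFile", "input.webm", await fetchFile(webmBlob));
// "-c copy" copies the existing streams as-is into the new container
await ffmpeg.run("-i", "input.webm", "-c", "copy", "output.mp4");
const data = ffmpeg.FS("readFile", "output.mp4");
return new Blob([data.buffer], { type: "video/mp4" });
}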
The MediaRecorder quality wasn’t great and I knew the drawing operations couldn’t be taking more than a millisecond, which meant most of the time was spent just waiting. I knew there must be a better way.
The code presented below uses the WebCodecs API and the mp4-muxer npm package to encode a video from a canvas source up to 10 times faster than realtime. A working demo of this code can be found here.
npm init
npm i mp4-muxer
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Canvas To MP4</title>
<script src="index.js"></script>
</head>
<body>
</body>
</html>
npm i esbuild
"scripts": {
"build": "esbuild src/index.ts --bundle --outfile=public/index.js"
},
import * as Mp4Muxer from "mp4-muxer";
async function run() {
const canvas = new OffscreenCanvas(720, 1280);
const ctx = canvas.getContext("2d", {
// This forces the use of a software (instead of hardware accelerated) 2D canvas
// This isn't necessary, but produces quicker results
willReadFrequently: true,
// Desynchronizes the canvas paint cycle from the event loop
// Should be less necessary with OffscreenCanvas, but with a real canvas you will want this
desynchronized: true,
});
const fps = 30;
const duration = 60;
const numFrames = duration * fps;
let muxer = new Mp4Muxer.Muxer({
target: new Mp4Muxer.ArrayBufferTarget(),
video: {
// If you change this, make sure to change the VideoEncoder codec as well
codec: "avc",
width: canvas.width,
height: canvas.height,
},
// mp4-muxer docs claim you should always use this with ArrayBufferTarget
fastStart: "in-memory",
});
let videoEncoder = new VideoEncoder({
output: (chunk, meta) => muxer.addVideoChunk(chunk, meta),
error: (e) => console.error(e),
});
// This codec should work in most browsers
// See https://dmnsgn.github.io/media-codecs for a list of codecs and whether your browser supports them
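// Optionally, you could first verify support for this exact configuration with
// the WebCodecs VideoEncoder.isConfigSupported() helper (sketch only, not part
// of the original code):
// const { supported } = await VideoEncoder.isConfigSupported({
//   codec: "avc1.42001f",
//   width: canvas.width,
//   height: canvas.height,
// });
// if (!supported) throw new Error("This browser can't encode avc1.42001f");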
videoEncoder.configure({
codec: "avc1.42001f",
width: canvas.width,
height: canvas.height,
bitrate: 500_000,
bitrateMode: "constant",
});
// Loops through and draws each frame to the canvas then encodes it
for (let frameNumber = 0; frameNumber < numFrames; frameNumber++) {
drawFrameToCanvas({
ctx,
canvas,
frameNumber,
numFrames
});
await renderCanvasToVideoFrameAndEncode({
canvas,
videoEncoder,
frameNumber,
fps,
});
}
// Forces all pending encodes to complete
await videoEncoder.flush();
muxer.finalize();
let buffer = muxer.target.buffer;
downloadBlob(new Blob([buffer]));
}
// Animates a red box moving from top left to top right of screen
function drawFrameToCanvas({ canvas, ctx, frameNumber, numFrames }) {
ctx.fillStyle = "white";
ctx.fillRect(0, 0, canvas.width, canvas.height);
const x = (frameNumber / numFrames) * canvas.width;
ctx.fillStyle = "red";
ctx.fillRect(x, 0, 100, 100);
}
async function renderCanvasToVideoFrameAndEncode({
canvas,
videoEncoder,
frameNumber,
fps,
}) {
let frame = new VideoFrame(canvas, {
// Equally spaces frames out depending on frames per second
timestamp: (frameNumber * 1e6) / fps,
});
// The encode() method of the VideoEncoder interface asynchronously encodes a VideoFrame
videoEncoder.encode(frame);
// The close() method of the VideoFrame interface clears all states and releases the reference to the media resource.
frame.close();
}
function downloadBlob(blob) {
let url = window.URL.createObjectURL(blob);
let a = document.createElement("a");
a.style.display = "none";
a.href = url;
a.download = "animation.mp4";
document.body.appendChild(a);
a.click();
window.URL.revokeObjectURL(url);
}
run();
npm i http-server
"scripts": {
"start": "http-server public"
},
npm start
By default, it should be accessible at http://127.0.0.1:8080
Ultimately, I’d like to be writing a lot more consistently and I think doing so in one place is the most likely way to succeed at this. I write so infrequently that I don’t have a good routine. There are certain things I like about this Jekyll blog, but things like adding screenshots are such a hassle that I procrastinate on certain posts because I know it will be too painful. Then I convince myself that I can just build something to make this more tolerable. I already have, and it’s called “hyde”. I’m still a bit lost as to whether it’s supposed to replace Jekyll or simply work in tandem with it. The name is apt because I literally feel like Dr. Jekyll getting angry and turning into Hyde.
These update posts are for those curious about what I’m currently working on and wanting a deeper look into the thoughts behind things. My very first post on adamberg.blog was about how I wanted and needed to get off of Twitter. It ended up taking me most of the year to finally make that leap. I found it very difficult to not open Twitter several times a day, despite never really enjoying the experience. I would much rather spend my time creating than consuming and so having a space that only lets me do one of those things is incredibly valuable to me.
This year kaizen.place will be a key focus for me. My good friend and co-founder, Ben, is coming on full-time and we will be learning and exploring how to make this a success for music creators and as a business. This journey will inevitably have lots of twists and turns, but I’m looking forward to the adventure.
I am again and again reminded that crafting good YouTube tutorials is where I should be focusing energy. My tutorial on Writing a Wav File in C picked up 10,000 views over the holidays and led to 233 subscribers so far. And I feel like it’s just getting started. There are so many things I want to put together, but I need to hone my focus and align goals of different projects. This most likely means continuing to focus on audio programming topics to be able to showcase the evolution of these projects on my music.tails Kaizen account.
This is probably the hardest one for me. I would like to be consistent in posting, particularly on my music.tails account. I find so many barriers and excuses for not posting here. What I really need to do is set up some kind of schedule where I block off time to make/prepare content. e.g. I have a bunch of footage from my first band performance from last year that has never seen the light of day. Going through and cutting out clips is just enough work and far enough outside my comfort zone that I continue to defer doing this.
In early February, I plan to put together a performance of some songs I have been learning to sing and play on piano. I’m hoping to lock down a date and venue this week so I can stop delaying this.
The word thrashing came up recently as a computer term mapped to the human brain. I have definitely been thrashing the last little while. I have so many things I want to do and others that I need to do and others yet that others want me to do. This year, I want to be more mindful of this and do a better job of prioritizing and probably delegating.
I’ve already spent longer on this than I would have liked, but at least I made it. Especially early on, I’d like to attempt to do this daily, essentially like a public journal. I think this would help me narrow down my focus and force some accountability on myself. If I can get it down to 15-20 minutes of writing or less, it should be easy enough to make part of a morning routine.
git clone https://github.com/InfiniTimeOrg/InfiniTime.git
cd InfiniTime
git submodule update --init
This uses their image from Docker Hub that has all the dependencies you need to compile the source. See further explanation here
docker run --rm -it -v ${PWD}:/sources --user $(id -u):$(id -g) infinitime/infinitime-build
Official docs here
brew install openocd
source [find interface/stlink.cfg]
gdb_flash_program enable
gdb_breakpoint_override hard
source [find target/nrf52.cfg]
init
program ./build/output/pinetime-mcuboot-app-image-1.13.0.bin verify 0x00008000
reset
See screenshots here for correct orientation. Pins should be connected to SWCLK, SWDIO, GND, and 3.3V on the ST-Link V2 side of things. The wire coloring seems to be standard, so you should have:
Brown - SWCLK
Red - SWDIO
White - GND
Black - 3.3V
When plugging the pins into the PineTime, orient the programmable pins so they are on the far side from you; the red pin should then go in the leftmost pin.
I tend to plug the pins into the PineTime first, before plugging in the USB portion. If you flip the watch face over and rest it on the pins, it seems to hold itself up while applying pressure on the pins, so you can actually see what’s happening without having to hold everything in place.
openocd -f ./openocd-stlink.ocd -f ./flash_bootloader_app.ocd
This appears to be mostly smartphone driven, which makes it a bit hard to imagine what kind of workflow one would use with it. I managed to install nRF Connect Mobile on my M1 Mac and started an OTA update. It took a very long time and then, at 99%, said there was an error. The ST-Link method is so much faster and more reliable that I probably won’t explore this any further.
InfiniSim seems to expect that you will be running on Linux. I imagine it should be possible to get it working on Mac, but given the number of hours I have fought SDL2 and CMake to work correctly, I decided to just try it on Ubuntu. Following their listed instructions on Ubuntu I was mostly able to get things going. lv_img_conv.py gave a “SyntaxError: invalid syntax” at line 163, match args.color_format:. I ended up just modifying the file to remove the match statement altogether and forcing one of the conditions. I assume it’s a Python version issue (match statements require Python 3.10 or newer), but I didn’t have the energy to figure out exactly what the problem was. It never ceases to amaze me how difficult reproducible builds are…
git clone --recursive https://github.com/InfiniTimeOrg/InfiniSim.git
cd InfiniSim
git submodule update --init --recursive
sudo apt install -y cmake libsdl2-dev g++ npm libpng-dev
npm install lv_font_conv@1.5.2
cmake -S . -B build
cmake --build build -j4
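Solution for Advent of Code (AoC) 2023 Day 3: Gear Ratios in JavaScript: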
import fs from "fs";
import { log } from "console";
const lines = fs.readFileSync("./day3.txt", { encoding: "utf-8" }).split("\n");
function isDigit(c) {
return /[\d]/.test(c);
}
function part1() {
let sum = 0;
for (const [rowIndex, line] of lines.entries()) {
let partNumber = '';
let startColIndex = 0;
for (const [colIndex, c] of line.split('').entries()) {
if (isDigit(c)) {
if (!partNumber) {
startColIndex = colIndex;
}
partNumber += c;
}
// A number ends at a non-digit character or at the end of the line
const isEndOfPartNumber = !isDigit(c) || (colIndex === line.length - 1);
if (isEndOfPartNumber && partNumber) {
// Search the rectangle around the number (one cell in every direction,
// clamped to the grid edges) for a symbol
const startRowSearchIndex = Math.max(rowIndex - 1, 0);
const endRowSearchIndex = Math.min(rowIndex + 1, lines.length - 1);
const startColSearchIndex = Math.max(startColIndex - 1, 0);
const endColSearchIndex = Math.min(colIndex, line.length - 1);
let symbolFound = false;
for (let i = startRowSearchIndex; i <= endRowSearchIndex; i++) {
for (let j = startColSearchIndex; j <= endColSearchIndex; j++) {
// Any character that isn't a digit or a '.' counts as a symbol
if (/[^\d\.]/.test(lines[i][j])) {
symbolFound = true;
}
}
}
if (symbolFound) {
sum += Number(partNumber);
}
partNumber = '';
}
}
}
log(sum)
}
function part2() {
let sum = 0;
const gearMap = {};
for (const [rowIndex, line] of lines.entries()) {
let partNumber = '';
let startColIndex = 0;
for (const [colIndex, c] of line.split('').entries()) {
if (isDigit(c)) {
if (!partNumber) {
startColIndex = colIndex;
}
partNumber += c;
}
const isEndOfPartNumber = !isDigit(c) || (colIndex === line.length - 1)
if (isEndOfPartNumber && partNumber) {
const startRowSearchIndex = Math.max(rowIndex - 1, 0);
const endRowSearchIndex = Math.min(rowIndex + 1, lines.length - 1);
const startColSearchIndex = Math.max(startColIndex - 1, 0);
const endColSearchIndex = Math.min(colIndex, line.length - 1)
for (let i = startRowSearchIndex; i <= endRowSearchIndex; i++) {
for (let j = startColSearchIndex; j <= endColSearchIndex; j++) {
// A '*' is a potential gear; remember the first number that touches it and
// multiply when a second number touches the same cell
if (/\*/.test(lines[i][j])) {
const key = `${i}-${j}`;
if (gearMap[key]) {
sum += gearMap[key] * Number(partNumber);
}
gearMap[key] = Number(partNumber);
}
}
}
partNumber = '';
}
}
}
log(sum)
}
part1();
part2();
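Solution for Advent of Code (AoC) 2023 Day 2: Cube Conundrum in JavaScript: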
import fs from "fs";
import { log } from "console";
const lines = fs.readFileSync("./day2.txt", { encoding: "utf-8" }).split("\n");
function part1() {
const maxDict = {
red: 12,
green: 13,
blue: 14,
};
let sum = 0;
let gameId = 1;
for (const line of lines) {
let passes = true;
// Strip the "Game N:" prefix and split the line into its semicolon-separated sets
const sets = line.replace(/Game [\d]*:\s/g, "").split(";");
for (const set of sets) {
const colorWithCount = set.split(",");
for (const colorCount of colorWithCount) {
const [count, color] = colorCount.trim().split(" ");
// count is still a string here; the > comparison coerces it to a number
if (count > maxDict[color]) {
passes = false;
}
}
}
if (passes) {
sum += gameId;
}
gameId++;
}
log(sum);
}
function part2() {
let sum = 0;
for (const line of lines) {
const sets = line.replace(/Game [\d]*:\s/g, "").split(";");
// Track the largest count seen for each color across this game's sets
const gameMaxMap = {};
for (const set of sets) {
const colorWithCount = set.split(",");
for (const colorCount of colorWithCount) {
const [count, color] = colorCount.trim().split(" ");
const countAsNum = Number(count);
const currentMax = gameMaxMap[color] || 0;
if (countAsNum > currentMax) {
gameMaxMap[color] = countAsNum;
}
}
}
const red = Number(gameMaxMap["red"]) || 0;
const green = Number(gameMaxMap["green"]) || 0;
const blue = Number(gameMaxMap["blue"]) || 0;
const power = red * green * blue;
sum += power;
}
log(sum);
}
part1();
part2();
Solution for Advent of Code (AoC) 2023 Day 1: Trebuchet?! in JavaScript:
import fs from "fs";
import { log } from "console";
const lines = fs.readFileSync("./day1.txt", {encoding: "utf-8"}).split("\n");
function isDigit(char) {
return /^\d$/.test(char);
}
function part1() {
let sum = 0;
for (const line of lines) {
let firstNum, lastNum;
for (const c of line) {
if (isDigit(c)) {
if (!firstNum) {
firstNum = c;
}
lastNum = c;
}
}
const twoDigit = `${firstNum}${lastNum}`;
sum += Number(twoDigit);
}
log(sum);
}
function part2() {
let sum = 0;
for(const line of lines) {
let digits = [];
for (let i = 0; i < line.length; i++) {
const c = line[i];
if (isDigit(c)) {
digits.push(c);
}
// Spelled-out digits also count; check whether one starts at the current position
const textDigits = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine'];
const lineSubstring = line.substring(i);
for (let d = 0; d < textDigits.length; d++) {
const textDigit = textDigits[d];
if (lineSubstring.startsWith(textDigit)) {
digits.push(d+1);
}
}
}
const lastIndex = digits.length - 1;
const twoDigits = `${digits[0]}${digits[lastIndex]}`;
sum += Number(twoDigits);
}
log(sum);
}
part1();
part2();
This is part of a project I have started to see if I can create a full song using just code. I started out writing in C, but for creating more interactive demos I decided to port it over to JavaScript. I had never heard of JavaScript’s DataView class before; it will ultimately be a lot easier to work with and easier for others to grok as well.
Head here for a breakdown of the WAV File Format
const sampleRate = 8000;
const durationSeconds = 10;
const numChannels = 1;
const bytesPerSample = 2 * numChannels;
const bytesPerSecond = sampleRate * bytesPerSample;
const dataLength = bytesPerSecond * durationSeconds;
const headerLength = 44;
const fileLength = dataLength + headerLength;
const bufferData = new Uint8Array(fileLength);
const dataView = new DataView(bufferData.buffer);
const writer = createWriter(dataView);
// HEADER
writer.string("RIFF");
// File Size
writer.uint32(fileLength);
writer.string("WAVE");
writer.string("fmt ");
// Chunk Size
writer.uint32(16);
// Format Tag
writer.uint16(1);
// Number Channels
writer.uint16(numChannels);
// Sample Rate
writer.uint32(sampleRate);
// Bytes Per Second
writer.uint32(bytesPerSecond);
// Bytes Per Sample
writer.uint16(bytesPerSample);
// Bits Per Sample
writer.uint16(bytesPerSample * 8);
writer.string("data");
writer.uint32(dataLength);
for (let i = 0; i < dataLength / 2; i++) {
const t = i / sampleRate;
const frequency = 256;
const volume = 0.6;
const val = Math.sin(2 * Math.PI * frequency * t) * volume;
writer.pcm16s(val);
}
const blob = new Blob([dataView.buffer], { type: 'application/octet-stream' });
// audioPlayer refers to an <audio> element on the page (e.g. <audio id="audioPlayer" controls>)
audioPlayer.src = URL.createObjectURL(blob);
function createWriter(dataView) {
let pos = 0;
return {
string(val) {
for (let i = 0; i < val.length; i++) {
dataView.setUint8(pos++, val.charCodeAt(i));
}
},
uint16(val) {
dataView.setUint16(pos, val, true);
pos += 2;
},
uint32(val) {
dataView.setUint32(pos, val, true);
pos += 4;
},
pcm16s: function(value) {
value = Math.round(value * 32768);
value = Math.max(-32768, Math.min(value, 32767));
dataView.setInt16(pos, value, true);
pos += 2;
},
}
}
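If you would rather save the result than play it in an audio element, a small sketch like the following (the downloadWav helper is hypothetical, not part of the original snippet) downloads the same data as a .wav file:
// Sketch: trigger a browser download of the generated WAV data
function downloadWav(dataView, filename = "test.wav") {
const blob = new Blob([dataView.buffer], { type: "audio/wav" });
const a = document.createElement("a");
a.href = URL.createObjectURL(blob);
a.download = filename;
a.click();
URL.revokeObjectURL(a.href);
}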
This is part of a project I have started to see if I can create a full song using just code.
Head here for a breakdown of the WAV File Format
struct wav_header
{
char riff[4]; /* "RIFF" */
int32_t flength; /* file length in bytes */
char wave[4]; /* "WAVE" */
char fmt[4]; /* "fmt " */
int32_t chunk_size; /* size of FMT chunk in bytes (usually 16) */
int16_t format_tag; /* 1=PCM, 257=Mu-Law, 258=A-Law, 259=ADPCM */
int16_t num_chans; /* 1=mono, 2=stereo */
int32_t srate; /* Sampling rate in samples per second */
int32_t bytes_per_sec; /* bytes per second = srate*bytes_per_samp */
int16_t bytes_per_samp; /* 2=16-bit mono, 4=16-bit stereo */
int16_t bits_per_samp; /* Number of bits per sample */
char data[4]; /* "data" */
int32_t dlength; /* data length in bytes (filelength - 44) */
};
// main.c
#include <string.h>
struct wav_header wavh;
const int sample_rate = 8000;
const int duration_seconds = 10;
const int buffer_size = sample_rate * duration_seconds;
const int header_length = sizeof(struct wav_header);
short int buffer[buffer_size] = {};
int main(void)
{
strncpy(wavh.riff, "RIFF", 4);
strncpy(wavh.wave, "WAVE", 4);
strncpy(wavh.fmt, "fmt ", 4);
strncpy(wavh.data, "data", 4);
wavh.chunk_size = 16;
wavh.format_tag = 1;
wavh.num_chans = 1;
wavh.srate = sample_rate;
wavh.bits_per_samp = 16;
wavh.bytes_per_sec = wavh.srate * wavh.bits_per_samp / 8 * wavh.num_chans;
wavh.bytes_per_samp = wavh.bits_per_samp / 8 * wavh.num_chans;
wavh.dlength = buffer_size * wavh.bytes_per_samp;
wavh.flength = wavh.dlength + header_length;
}
#include <stdio.h>
FILE *fp = fopen("test.wav", "w");
fwrite(&wavh, 1, header_length, fp);
fwrite(buffer, 2, buffer_size, fp);
#include <math.h>
for (int i = 0; i < buffer_size; i++) {
buffer[i] = (short int)((cos((2 * M_PI * MIDDLE_C * i) / sample_rate) * 1000));
}
// main.c
#include <string.h>
#include <stdio.h>
#include <math.h>
struct wav_header
{
char riff[4]; /* "RIFF" */
int32_t flength; /* file length in bytes */
char wave[4]; /* "WAVE" */
char fmt[4]; /* "fmt " */
int32_t chunk_size; /* size of FMT chunk in bytes (usually 16) */
int16_t format_tag; /* 1=PCM, 257=Mu-Law, 258=A-Law, 259=ADPCM */
int16_t num_chans; /* 1=mono, 2=stereo */
int32_t srate; /* Sampling rate in samples per second */
int32_t bytes_per_sec; /* bytes per second = srate*bytes_per_samp */
int16_t bytes_per_samp; /* 2=16-bit mono, 4=16-bit stereo */
int16_t bits_per_samp; /* Number of bits per sample */
char data[4]; /* "data" */
int32_t dlength; /* data length in bytes (filelength - 44) */
};
struct wav_header wavh;
const float MIDDLE_C = 256.00;
const int sample_rate = 8000;
const int duration_seconds = 10;
const int buffer_size = sample_rate * duration_seconds;
short int buffer[buffer_size] = {};
const int header_length = sizeof(struct wav_header);
int main(void)
{
strncpy(wavh.riff, "RIFF", 4);
strncpy(wavh.wave, "WAVE", 4);
strncpy(wavh.fmt, "fmt ", 4);
strncpy(wavh.data, "data", 4);
wavh.chunk_size = 16;
wavh.format_tag = 1;
wavh.num_chans = 1;
wavh.srate = sample_rate;
wavh.bits_per_samp = 16;
wavh.bytes_per_sec = wavh.srate * wavh.bits_per_samp / 8 * wavh.num_chans;
wavh.bytes_per_samp = wavh.bits_per_samp / 8 * wavh.num_chans;
for (int i = 0; i < buffer_size; i++) {
buffer[i] = (short int)((cos((2 * M_PI * MIDDLE_C * i) / sample_rate) * 1000));
}
wavh.dlength = buffer_size * wavh.bytes_per_samp;
wavh.flength = wavh.dlength + header_length;
FILE *fp = fopen("test.wav", "w");
fwrite(&wavh, 1, header_length, fp);
fwrite(buffer, 2, buffer_size, fp);
}
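Assuming gcc, something like the following should compile and run it (the -lm flag links the math library needed for cos):
gcc main.c -o wav -lm
./wav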
A repository seems like an easy place to start. Even in software, a “repository” could just as easily be called a project. In music lingo, this is probably most analogous to an album. The goal of the repository is to be a central place where everything related to that project is stored.
On Kaizen, a “Project” is currently the closest thing to a repository right now.
As a software developer, a commit is the next most obvious thing that comes to mind when I think of Git/Github. Commits are little checkpoints made on the way to some kind of larger progress.
I can see the value of bringing something like this to the music creation process, but in reality it feels cumbersome. The music process seems less linear. With code, I know I have made progress towards my ultimate goal (cough usually…), but with music you might try something in one direction and then go a different direction.
Much like with code, I think it’s more of a personal preference what you define as a commit. Say you try something with your track and want to remember what it sounds like, that might indicate a good time to take a snapshot.
It’s not perfect, but with Kaizen currently you can capture a version at these “checkpoints”. Right now, these would be bounced to a single mp3, meaning you couldn’t really go back to the state of things in your Digital Audio Workstation (DAW). I’m unclear how much value there would be in bringing this kind of granularity to things. As long as you are still working within the same session, you should be able to undo/redo yourself back to where you’d like to be.
I would argue that the “push” is currently where Kaizen enters the picture. When coding, I typically push when I have made some tangible progress or am ready to get eyes on what I’ve been working on. For a musician, this would be uploading a new version of their track to Kaizen.
Currently uploading a track and releasing it are nearly synonymous. I could see this changing in the future to better identify something as a “release” vs. just an update. In software, you could make several changes to your project over weeks, but only once they were all in together would you compile that into a release. I think the same would be mostly true for music.
The closest thing to a pull right now would be just downloading a version of a song. Unfortunately, right now this wouldn’t give you access to things like the stems, or the workspace that the music was created in. I’m still undecided as to whether this should be solved and, if so, how. One possible option could be only storing a single audio file for each version, but then using AI to extract the stems if someone is looking to use them in some different way.
With code, branches make sense because it is reasonably feasible to manage merging text changes that have drifted apart. For music, I think something like branches would look quite different. Instead of something like “feature” branches, it might look more like a branch to explore a possible song direction. If this were true, branches would likely never get merged back into some main branch. Instead, whichever branch was the preferred direction would be the one that is followed.
An example of this could be a “sad piano” branch and an “upbeat acoustic” branch. While the current status quo says that you need to pick one for your final release, I think the freedom to be able to simply have both wins. You might like one better than the other, but fans might have their own preferences.
This is one of the more interesting concepts when mapped to music. Forking a song could clarify where inspiration came from. When forking software, it’s likely you are using most of the existing code that you are forking; I think this is less true of music. You might fork for only a small sample or stem of a piece of music. But I think the ability to give credit and show the ancestry of the music is special.
I found it fascinating when I learned that The Postal Service came up with their name because they were essentially sending songs back and forth through the mail and incrementally updating them. This seems like it would be quite common when working with a group of people rather than just yourself. It would be great to have Projects with multiple collaborators assigned, each with access to upload new versions, and with everyone getting credit for being a part of the composition.
There are still many more thoughts swirling for me on this topic, but I think this is a good place to stop. A larger topic that I didn’t get into is diffing audio files. I will likely make that a subject of my next post and likely use it as an opportunity to put some kind of demo together for it.