Do I need to convert the efs_time column into categories to use the DeepHit model?

Ali Ahmed55 · February 4

Assalamu alaikum,

To use the DeepHit model, do I have to convert the efs_time column into categories?

Or does the problem have to be a classification problem, so that regression won't work?

محمد عاطف17 · February 4

Wa alaikum assalam wa rahmatullahi wa barakatuh.

No, you do not need to convert efs_time into categories. The model relies on continuous time data to predict the probability of the event occurring at different times, for example time-to-event data such as a patient's death or the failure of a device.

The model uses time as a continuous variable and relies on a neural network to predict the event from the time sequence. It can therefore handle time data directly, without converting it into categories.

As for the second question, the answer is also no. The model is not a classification model; it solves a survival analysis problem, a type of problem that involves predicting the time at which a given event (such as death or failure) occurs, not merely classifying whether the event happened or not.

Ali Ahmed55 · February 4

Great, alhamdulillah.

What I am actually asked to do is prediction (regression), not classification.

Jazak Allah khair.

Chihab Hedidi · February 4

The model does not require converting the efs_time column into categories; it works directly on continuous time data. Converting time into categories can lose information and reduce the model's accuracy. Survival analysis is not a conventional classification problem but a problem of predicting the probability that a given event occurs at different times, and the model predicts the survival probability, or the probability of the event occurring, over time.

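Ali Ahmed55 · February 4

So is this code okay as a first draft?

# Construct the model (data_train is assumed to be an already loaded DataFrame)
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler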

# Separate features (X) and target variables (y)
x = data_train.drop(['efs', 'efs_time'], axis=1, inplace=False)  # Features (all columns except 'efs' and 'efs_time')
y_event = data_train['efs']  # First target variable (event outcome)
y_time = data_train['efs_time']  # Second target variable (event time)

# Step 1: Split data into 70% training and 30% temporary set (which will be further split)
x_train, x_temp, y_event_train, y_event_temp, y_time_train, y_time_temp = train_test_split(x, y_event, y_time, test_size=0.3, random_state=42)

# Step 2: Split the temporary set into 15% validation (dev) and 15% test
x_dev, x_test, y_event_dev, y_event_test, y_time_dev, y_time_test = train_test_split(x_temp, y_event_temp, y_time_temp, test_size=0.5, random_state=42)

# Print dataset sizes for verification
#print(f"X_train: {x_train.shape}, x_dev: {x_dev.shape}, X_test: {x_test.shape}")
#print(f"y_event_train: {y_event_train.shape}, y_event_dev: {y_event_dev.shape}, y_event_test: {y_event_test.shape}")
#print(f"y_time_train: {y_time_train.shape}, y_time_dev: {y_time_dev.shape}, y_time_test: {y_time_test.shape}")

NUM_DURATIONS = 10


# Step 3: Apply standard scaling to the features to standardize the data
scaler = StandardScaler()  # Initialize the scaler
x_train_scaled = scaler.fit_transform(x_train)  # Fit the scaler on the training data and transform it
x_test_scaled = scaler.transform(x_test)  # Transform the test data based on the scaler fit on the training data



# Define the input dimension based on the number of features in the training data
input_dim = x_train_scaled.shape[1]


# Define the input layer with the shape matching the feature dimension
inputs = keras.layers.Input(shape=(input_dim,))

# Add the first dense layer with 128 neurons and ReLU activation
# This layer processes the input data to extract complex features
x = keras.layers.Dense(128, activation='relu')(inputs)
x = keras.layers.BatchNormalization()(x)
# Add the second dense layer with 64 neurons and ReLU activation
# This further processes the output from the previous layer to capture more intricate patterns
x = keras.layers.Dense(64, activation='relu')(x)
x = keras.layers.BatchNormalization()(x)
# Add the third dense layer with 32 neurons and ReLU activation
# This layer continues refining the learned features from the previous layers
x = keras.layers.Dense(32, activation='relu')(x)

# Output layer for the event time: a single ReLU unit, i.e. a non-negative regression output.
# Note that this predicts one continuous value, not a probability distribution over time bins.
output_time = keras.layers.Dense(1, activation='relu', name='time-output')(x)

# The output layer for predicting the event outcome (e.g., whether the event occurred or was censored),
# using the 'sigmoid' activation function. This gives a probability value between 0 and 1.
output_event = keras.layers.Dense(1, activation='sigmoid', name='event-output')(x)

# Constructing the final model, which takes the 'inputs' and outputs both the time-to-event predictions
# and the event predictions. This is a multi-output model designed for survival analysis tasks.
deep_hit_model = keras.models.Model(inputs=inputs, outputs=[output_time, output_event])

# Compiling the Keras model with the specified optimizer, loss function, and metrics
deep_hit_model.compile(

    optimizer = tf.keras.optimizers.AdamW(
        learning_rate=0.001, 
        weight_decay=0.004,
        beta_1=0.9,
        beta_2=0.999,
        epsilon=1e-07,
        amsgrad=False,
        clipnorm=None,
        clipvalue=None,
        global_clipnorm=None,
        use_ema=False,
        ema_momentum=0.99,
        ema_overwrite_frequency=None,
        loss_scale_factor=None,
        gradient_accumulation_steps=None,
        name='adamw',),
    loss = {"time-output": "mean_squared_error", "event-output": "mean_squared_error"},
    metrics = {"time-output": "mean_absolute_error", "event-output": "mean_absolute_error"}

)


# Training the Keras model with the specified data, epochs, batch size, and callbacks
deep_hit_model.fit(
    x_train_scaled,
    {"time-output": y_time_train, "event-output": y_event_train},
    validation_data=(x_test_scaled, {"time-output": y_time_test, "event-output": y_event_test}),
    epochs=50,
    batch_size=128,
    callbacks=[keras.callbacks.EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)]
)

Mustafa Suleiman · February 4

The DeepHit model requires converting the time column efs_time into time categories (classes) rather than treating it as a direct regression problem.

You used a conventional regression model instead of the multi-class classification over time intervals that DeepHit requires.

As I mentioned earlier, DeepHit relies on dividing time into predefined intervals, which turns the continuous-time prediction problem into a multi-class classification problem. It is done as follows:

import numpy as np
import pandas as pd

# Quantile-based bin edges, computed on the training durations only
time_bins = np.percentile(y_time_train, np.linspace(0, 100, NUM_DURATIONS + 1))
# Open both ends so dev/test values outside the training range still get a bin
time_bins[0] = -np.inf
time_bins[-1] = np.inf

# Map each duration to an integer bin index in [0, NUM_DURATIONS - 1]
time_labels = list(range(NUM_DURATIONS))
y_time_train_binned = pd.cut(y_time_train, bins=time_bins, labels=time_labels, include_lowest=True)
y_time_test_binned = pd.cut(y_time_test, bins=time_bins, labels=time_labels, include_lowest=True)
y_time_dev_binned = pd.cut(y_time_dev, bins=time_bins, labels=time_labels, include_lowest=True)

# pd.cut returns categoricals; convert them to plain integers
y_time_train_binned = y_time_train_binned.astype(int)
y_time_test_binned = y_time_test_binned.astype(int)
y_time_dev_binned = y_time_dev_binned.astype(int)
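
To see what this kind of binning produces, here is a small self-contained sketch on made-up durations. The toy values, the smaller NUM_DURATIONS, and the helper name `bin_durations` are assumptions for illustration only. The edges are fitted on the training durations and reused for unseen data, and both outer edges are opened so values outside the training range still land in a bin.

```python
import numpy as np
import pandas as pd

NUM_DURATIONS = 4  # small value chosen only for this toy example


def bin_durations(train_times, other_times, num_bins):
    """Quantile-bin durations using edges fitted on the training data only."""
    edges = np.percentile(train_times, np.linspace(0, 100, num_bins + 1))
    # Open both outer edges so unseen values outside the training range get a bin
    edges[0], edges[-1] = -np.inf, np.inf
    labels = list(range(num_bins))
    train_binned = pd.cut(train_times, bins=edges, labels=labels).astype(int)
    other_binned = pd.cut(other_times, bins=edges, labels=labels).astype(int)
    return edges, train_binned, other_binned


# Hypothetical durations standing in for efs_time
train_times = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
new_times = pd.Series([0.5, 5.0, 100.0])  # two values lie outside the training range

edges, train_binned, new_binned = bin_durations(train_times, new_times, NUM_DURATIONS)
print(edges)
print(train_binned.tolist())
print(new_binned.tolist())
```

Because the edges are quantiles, each bin receives roughly the same number of training rows, which keeps the classes balanced.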

After dividing time into categories, the model must be modified so that it can predict the probability of the event occurring in each time category, that is, by changing the output layer of the Keras model:

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.layers.Input(shape=(input_dim,))

x = keras.layers.Dense(128, activation='relu')(inputs)
x = keras.layers.BatchNormalization()(x)
x = keras.layers.Dense(64, activation='relu')(x)
x = keras.layers.BatchNormalization()(x)
x = keras.layers.Dense(32, activation='relu')(x)

# Each output layer must be called on x to produce a tensor the Model can use
output_time = keras.layers.Dense(NUM_DURATIONS, activation='softmax', name='time-output')(x)

output_event = keras.layers.Dense(1, activation='sigmoid', name='event-output')(x)

deep_hit_model = keras.models.Model(inputs=inputs, outputs=[output_time, output_event])

deep_hit_model.compile(
    optimizer=keras.optimizers.AdamW(
        learning_rate=0.001,
        weight_decay=0.004
    ),
    loss={
        "time-output": "categorical_crossentropy",  
        "event-output": "binary_crossentropy"     
    },
    metrics={
        "time-output": "accuracy",
        "event-output": "accuracy"
    }
)

To be compatible with categorical_crossentropy, the time categories must be converted to a one-hot representation.

from tensorflow.keras.utils import to_categorical

y_time_train_encoded = to_categorical(y_time_train_binned, num_classes=NUM_DURATIONS)
y_time_test_encoded = to_categorical(y_time_test_binned, num_classes=NUM_DURATIONS)
y_time_dev_encoded = to_categorical(y_time_dev_binned, num_classes=NUM_DURATIONS)

Then modify the training call to use the converted categories:

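# Scale the dev set with the scaler that was fitted on the training data
x_dev_scaled = scaler.transform(x_dev)

deep_hit_model.fit(
    x_train_scaled,
    {
        "time-output": y_time_train_encoded,
        "event-output": y_event_train
    },
    # Validate (and early-stop) on the dev set so the test set stays untouched
    validation_data=(
        x_dev_scaled,
        {
            "time-output": y_time_dev_encoded,
            "event-output": y_event_dev
        }
    ),
    epochs=50,
    batch_size=128,
    callbacks=[keras.callbacks.EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)]
)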
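
Once a two-headed model like this is trained, `predict` returns one array per output: a row of NUM_DURATIONS softmax probabilities for `time-output` and a single sigmoid value between 0 and 1 for `event-output`. The sketch below uses hand-written arrays in place of real predictions (the numbers and the 0.5 threshold are assumptions for illustration) and shows one possible way to turn them into a hard 0/1 label and a survival-style curve.

```python
import numpy as np

# Stand-ins for the two arrays returned by deep_hit_model.predict(...); values are made up
time_probs = np.array([[0.1, 0.2, 0.3, 0.4],
                       [0.7, 0.1, 0.1, 0.1]])  # softmax rows, each sums to 1
event_probs = np.array([[0.82], [0.35]])         # sigmoid outputs, each in (0, 1)

# Hard 0/1 event label: threshold the probability (0.5 is an arbitrary choice)
event_labels = (event_probs.ravel() >= 0.5).astype(int)

# Most likely time bin for each row
most_likely_bin = time_probs.argmax(axis=1)

# Probability that the event has not happened by the end of each bin,
# if each softmax row is read as a distribution of the event time over bins
survival = 1.0 - np.cumsum(time_probs, axis=1)

print(event_labels)
print(most_likely_bin)
print(np.round(survival, 2))
```

The sigmoid head therefore gives a value between 0 and 1; a strict 0 or 1 only appears after thresholding.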

Ali Ahmed55 · February 4

Very good.

But does that still hold even if what is required is predicting the value of efs?

Mustafa Suleiman · February 4

You can, but it is better to use it for survival analysis rather than direct regression. DeepHit is designed primarily for survival analysis and for handling censored data and the challenges that come with it, which makes it well suited to estimating survival functions and event distributions over time.

If your main goal is to directly predict a continuous time value such as efs_time, conventional or more advanced regression models are more appropriate, especially if your data contains no censored cases.
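
To make the point about censored data concrete: a censored row only says the event had not happened by its last observed time bin, so it should not be scored as if the event occurred in that bin. Below is a minimal sketch of a discrete-time negative log-likelihood that respects this, in the spirit of DeepHit's likelihood term. It leaves out DeepHit's ranking term, and the function name `censored_nll` and the toy numbers are assumptions for illustration.

```python
import numpy as np


def censored_nll(pmf, time_bin, event, eps=1e-7):
    """Mean negative log-likelihood over discrete time bins with right censoring.

    pmf      : (n, k) array, each row a probability distribution over k time bins
    time_bin : (n,) int array, bin of the event or of the last observation
    event    : (n,) array, 1 if the event was observed, 0 if the row is censored
    """
    rows = np.arange(len(time_bin))
    # Observed event: probability mass placed on the bin where it happened
    p_event = pmf[rows, time_bin]
    # Censored: probability that the event happens after the last observed bin
    p_survive = 1.0 - np.cumsum(pmf, axis=1)[rows, time_bin]
    likelihood = np.where(event == 1, p_event, p_survive)
    # Clip to avoid log(0) when a row assigns no mass to the relevant outcome
    return float(-np.mean(np.log(np.clip(likelihood, eps, 1.0))))


pmf = np.array([[0.1, 0.2, 0.3, 0.4],
                [0.7, 0.1, 0.1, 0.1]])
time_bin = np.array([2, 0])
event = np.array([1, 0])  # first row: event in bin 2; second row: censored in bin 0

print(censored_nll(pmf, time_bin, event))
```

A plain regression loss has no counterpart to the censored branch, which is why it fits best when the data contains no censored cases.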

Ali Ahmed55 · February 4

Ah, so the better approach is to predict the efs value by splitting efs_time into categories, right? Am I understanding this correctly?

Ali Ahmed55 · February 4

Right, so in the code you wrote, will pred_event be 0 or 1, or a value between 0 and 1?
